ggenny opened this issue 2 years ago
It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue.
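The mechanism can be sketched as follows (a simplified illustration, not the actual yolov7 loss code; it mirrors the `tobj = (1.0 - self.gr) + self.gr * iou` target used in the loss computation):

```python
# Sketch of how the objectness target blends 1.0 with the box IoU via the
# gr factor. Lower gr pulls targets back toward 1.0, so confidence scores
# stay higher for imperfect (e.g. crowded/overlapping) boxes.
def objectness_target(iou, gr=1.0):
    """BCE target for objectness: (1 - gr) + gr * iou."""
    return (1.0 - gr) + gr * iou

# gr = 1.0: the target equals the IoU, so an imperfect box is trained
# toward confidence < 1.
print(objectness_target(0.6, gr=1.0))  # 0.6
# gr = 0.5: the target is pulled halfway back toward 1.0.
print(objectness_target(0.6, gr=0.5))  # 0.8
```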
By the way, my testing results for YOLOv7 are below.
Can you draw the boxes thinner so we can quickly compare YOLOv3 and v7, @WongKinYiu? As far as I can see, YOLOv7 is still worse than v3 for this case.
Yes, to handle this case the model needs to be retrained with the above-mentioned suggestion.
@WongKinYiu do we need to change these parameters in hyp.scratch.yml?

obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
No, self.gr is set in train.py.
Okay, I have another question, sorry if it's a noob question.
hyp['cls'] *= nc / 80. * 3. / nl  # scale to classes and layers

nc = number of classes
nl = model.model[-1].nl  # number of detection layers (used for scaling hyp['obj'])
What is the 80 for? (Is it COCO's 80 classes?) Do we need to change this number for a custom-trained model with a different number of classes?
Yes, COCO's 80 classes. You do not need to change it for a custom dataset.
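A minimal sketch of why no change is needed (illustrative values, not the real file contents): train.py rescales hyp['cls'] by nc / 80 automatically, so any class count is handled:

```python
# Illustrative sketch of the hyperparameter auto-scaling in train.py.
# With fewer classes than COCO's 80, the cls loss gain simply shrinks.
nc = 6   # classes in the custom dataset (example)
nl = 3   # number of detection layers, model.model[-1].nl
hyp = {'cls': 0.3}  # example cls loss gain

hyp['cls'] *= nc / 80. * 3. / nl  # scale to classes and layers
print(round(hyp['cls'], 4))  # 0.0225
```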
Try training your model on crowd dataset
For a custom dataset, I use YOLOv7 to train an object detection model. Are the following 3 steps all I need to do?
Create a data/custom.yaml file, e.g.:

# COCO 2017 dataset http://cocodataset.org - first 128 training images
# Train command: python train.py --data coco128.yaml
# Default dataset location is next to /yolov5:
# /parent_folder
# /coco128
# /yolov5
# download command/URL (optional)
#download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/train/
val: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/
test: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/
# number of classes
nc: 6
# class names
names: ['scratch','crack','leakage','membrane','wiredscreen','blurredscreen']
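A quick consistency check for such a config can help catch mistakes before training; this is my own hypothetical helper, not part of the repo:

```python
# Sanity check for a dataset config like data/custom.yaml above:
# nc must equal the number of entries in names, and the expected keys
# (train, val, nc, names) should all be present.
custom = {
    "train": "/home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/train/",
    "val": "/home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/",
    "nc": 6,
    "names": ["scratch", "crack", "leakage", "membrane",
              "wiredscreen", "blurredscreen"],
}

assert custom["nc"] == len(custom["names"]), "nc must equal len(names)"
assert all(k in custom for k in ("train", "val", "nc", "names"))
print("config looks consistent")
```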
In cfg/training/yolov7.yaml, change

nc: 80  # number of classes

to

nc: 6  # number of classes
Choose a training mode:
From scratch with p5
python train.py --workers 8 --device 0 --batch-size 32 --epochs 100 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7_flash_p5 --hyp data/hyp.scratch.p5.yaml
Transfer learning with yolov7_training.pt
python train.py --workers 8 --device 0 --batch-size 32 --epochs 100 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '/home/epbox/AI/pre_weights/yolov7/yolov7_training.pt' --name yolov7_flash --hyp data/hyp.scratch.custom.yaml
Just use a lower --conf 0.10 or --conf 0.025, and YOLOv7 will still have more correct predictions and fewer wrong predictions than YOLOv2/3/4/5/...
Main issue: you are only looking at True Positives in 1 image, while not taking into account millions of other images with no people in them, where YOLOv2/3/4/5/... will still predict people, producing more False Positives. This is why we measure accuracy on ~20,000 MS COCO test-dev images using the AP metric, which considers True/False Positives/Negatives over all possible confidence thresholds.
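To make that concrete, here is a toy sketch of 11-point average precision (my own simplified version, not the COCO evaluator) showing how the metric sweeps every confidence threshold instead of judging one image at one threshold:

```python
# Toy 11-point AP over a ranked list of detections.
# Each detection is (confidence, is_true_positive); n_gt is the number of
# ground-truth objects. Walking the list in confidence order implicitly
# evaluates every possible confidence threshold.
def average_precision(dets, n_gt):
    dets = sorted(dets, key=lambda d: -d[0])  # highest confidence first
    tp = fp = 0
    pr = []  # (recall, precision) after each detection
    for _, is_tp in dets:
        tp += 1 if is_tp else 0
        fp += 0 if is_tp else 1
        pr.append((tp / n_gt, tp / (tp + fp)))
    # interpolated precision at recalls 0.0, 0.1, ..., 1.0
    return sum(max((p for r, p in pr if r >= t), default=0.0)
               for t in [i / 10 for i in range(11)]) / 11

dets = [(0.9, True), (0.8, True), (0.7, False), (0.3, True), (0.1, False)]
print(round(average_precision(dets, n_gt=4), 3))  # 0.682
```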
As @WongKinYiu said, YOLOv7 uses IoU(pred_box, target_box) as the target for objectness, so the loss function changes from

obj_loss = BCEWithLogitsLoss(predicted_objectness, 1.0)

to

obj_loss = BCEWithLogitsLoss(predicted_objectness, IoU(pred, target))

which reduces the predicted confidence score. Therefore there are 2 ways:
1) use a lower --conf threshold at inference, or 2) reduce model.gr (set to 1.0 by default in train.py) and retrain:
https://github.com/WongKinYiu/yolov7/blob/main/train.py#L288

I just ran prediction using YOLOv2, v3, v4, v7, v7-e6e, v7 (thresh=0.025), and v7-e6e (thresh=0.025). You can see that v7 (thresh=0.025) and v7-e6e (thresh=0.025) predict almost all persons (and several backpacks, handbags, and cell phones), with more correct and fewer wrong predictions than the previous versions v2-v4.
For all models I use 640x640 resolution, while for v7-e6e I use 1280x1280.
YOLOv2: darknet.exe detector test data/coco.data cfg/yolov2.cfg yolov2.weights -thresh 0.25 -ext_output crowd.png
YOLOv3: darknet.exe detector test data/coco.data cfg/yolov3.cfg yolov3.weights -thresh 0.25 -ext_output crowd.png
YOLOv4: darknet.exe detector test data/coco.data cfg/yolov4.cfg yolov4.weights -thresh 0.25 -ext_output crowd.png
YOLOv7 conf 0.25: python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --device 0 --source crowd.png --view-img
YOLOv7-e6e conf 0.25: python detect.py --weights yolov7-e6e.pt --conf 0.25 --img-size 1280 --device 0 --source crowd.png --view-img
YOLOv7 conf 0.025: python detect.py --weights yolov7.pt --conf 0.025 --img-size 640 --device 0 --source crowd.png --view-img
YOLOv7-e6e conf 0.025: python detect.py --weights yolov7-e6e.pt --conf 0.025 --img-size 1280 --device 0 --source crowd.png --view-img
> It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue. By the way, my testing results for YOLOv7 are below.

Is there a recommended value for self.gr? Thanks.
Same question.
Same question: what is the recommended value for model.gr? Setting the value too high or too low to suit one image would produce poor results on other images.
Hi @glenn-jocher !
I found an interesting issue, and I tested it with YOLOv5. Recent detection models that fold localization accuracy into their detection scores perform poorly when many objects overlap.
YOLOv5x conf 0.25
YOLOv5l conf 0.25
YOLOv5m conf 0.25
YOLOv5s conf 0.25
YOLOv5n conf 0.25
YOLOv5x conf 0.05
YOLOv5l conf 0.05
YOLOv5m conf 0.05
YOLOv5s conf 0.05
YOLOv5n conf 0.05
I am evaluating performance on a different image (a people crowd); the detection ability on people is significantly lower than v3. As an example, this image (using the web demo):
[YOLOV3] ./darknet detector test cfg/coco.data ./cfg/yolov3.cfg ./yolov3.weights
YOLOv7 [web demo, same as yolov4]
Original image:
Is it a neck-definition problem?