WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

Very poor result on object crowd, same as yolov4 #429

Open ggenny opened 2 years ago

ggenny commented 2 years ago

I am evaluating the performance on different images (people crowds); the detection ability on people is significantly lower than v3. As an example, this image (using the web demo):

[YOLOV3] ./darknet detector test cfg/coco.data ./cfg/yolov3.cfg ./yolov3.weights image

YOLOV7 [ Web Demo, same as yolov4 ] yolov7

Original image: 144037989-efaa1de8-e24d-488f-99c5-f9d8fd69bed5 (1)

Is it a neck definition problem?

WongKinYiu commented 2 years ago

It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue.
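For context, in the yolov5-style loss that this repo inherits, self.gr blends the IoU into the objectness target (roughly tobj = (1 - gr) + gr * iou). A minimal sketch of the effect, with illustrative values:

```python
# Sketch of how self.gr blends IoU into the objectness target
# (mirrors the yolov5/yolov7-style loss: tobj = (1 - gr) + gr * iou).
def objectness_target(iou: float, gr: float) -> float:
    """With gr=1.0 the target equals the IoU; with gr=0.0 it is always 1.0."""
    return (1.0 - gr) + gr * max(iou, 0.0)

# A heavily occluded person often matches its anchor with low IoU,
# so with gr=1.0 the model learns a low objectness score for it:
print(objectness_target(0.3, 1.0))  # 0.3 -> weak confidence
print(objectness_target(0.3, 0.5))  # ~0.65 -> reducing gr raises the target
print(objectness_target(0.3, 0.0))  # 1.0 -> classic "objectness = 1" target
```

This is why occluded people in crowds get low confidence scores and fall below the default threshold.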

By the way, my testing result of yolov7 is as below. image

trungpham2606 commented 2 years ago

Can you draw the boxes thinner so we can have a quick comparison between yolov3 and v7, @WongKinYiu? As far as I can see, yolov7 is still worse than v3 for this case.

WongKinYiu commented 2 years ago

Yes, to handle this case the model needs to be retrained with the above-mentioned suggestion.

akashAD98 commented 2 years ago

@WongKinYiu do we need to change these parameters in hyp.scratch.yml?

obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold

WongKinYiu commented 2 years ago

No, self.gr is set in train.py.

akashAD98 commented 2 years ago

Okay, I have another question; sorry, it's a noob question.

hyp['cls'] *= nc / 80. * 3. / nl  # scale to classes and layers

nc = number of classes
nl = model.model[-1].nl  # number of detection layers (used for scaling hyp['obj'])

What is the 80 for? (Is it COCO's 80 classes?) Do we need to change this number for a custom-trained model with a different number of classes?

WongKinYiu commented 2 years ago

Yes, COCO's 80 classes. You do not need to change it for a custom dataset.
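The scaling line takes care of this automatically, which can be checked with plain arithmetic. A small sketch (the 0.3 base cls gain is illustrative):

```python
# Sketch of the class-loss gain scaling in train.py:
# hyp['cls'] *= nc / 80. * 3. / nl  (scale to classes and layers)
def scale_cls_gain(cls_gain: float, nc: int, nl: int = 3) -> float:
    """Rescale the cls gain from COCO's 80 classes / 3 layers to a custom setup."""
    return cls_gain * nc / 80.0 * 3.0 / nl

# COCO itself: 80 classes, 3 detection layers -> gain unchanged
print(scale_cls_gain(0.3, nc=80, nl=3))  # ~0.3
# A 6-class custom dataset: the gain shrinks automatically,
# which is why nothing needs to be edited by hand.
print(scale_cls_gain(0.3, nc=6, nl=3))   # ~0.0225
```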

spacewalk01 commented 2 years ago

Try training your model on a crowd dataset.

Digital2Slave commented 2 years ago

> Yes, COCO's 80 classes. You do not need to change it for a custom dataset.

For a custom dataset, I use yolov7 to train the object detection model. Are the following 3 steps all I need to do?

  1. Create the data/custom.yaml file, eg:
# COCO 2017 dataset http://cocodataset.org - first 128 training images
# Train command: python train.py --data coco128.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /coco128
#     /yolov5

# download command/URL (optional)
#download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/train/ 
val: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/
test: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/

# number of classes
nc: 6

# class names
names: ['scratch','crack','leakage','membrane','wiredscreen','blurredscreen']
  2. Change nc: 80 # number of classes in cfg/training/yolov7.yaml to nc: 6 # number of classes

  3. Choose the train mode and run:

python train.py --workers 8 --device 0 --batch-size 32 --epochs 100 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '/home/epbox/AI/pre_weights/yolov7/yolov7_training.pt' --name yolov7_flash --hyp data/hyp.scratch.custom.yaml
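Before launching training, the two yaml edits above can be sanity-checked so that nc and the names list agree. A minimal stdlib-only sketch (the paths here are placeholders, not the real dataset paths):

```python
# Minimal consistency check mirroring the 3 steps above (paths are placeholders).
import ast

custom_yaml = """\
train: /data/custom/images/train/
val: /data/custom/images/val/
test: /data/custom/images/val/
nc: 6
names: ['scratch','crack','leakage','membrane','wiredscreen','blurredscreen']
"""

# Naive key: value parsing, enough for this flat example file.
cfg = {}
for line in custom_yaml.splitlines():
    key, _, value = line.partition(": ")
    cfg[key] = value

nc = int(cfg["nc"])
names = ast.literal_eval(cfg["names"])  # parse the inline list literal

# Step 1 and step 2 must agree: nc in data/custom.yaml, nc in
# cfg/training/yolov7.yaml, and the length of the names list.
assert nc == len(names) == 6
print(names[0])  # scratch
```

A mismatch between nc and the names list is one of the most common custom-training mistakes, and it fails in confusing ways at train time.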
AlexeyAB commented 2 years ago

Just use a lower --conf 0.10 or --conf 0.025 and YOLOv7 will still have more correct predictions and fewer wrong predictions than YOLOv2/3/4/5/...

The main issue: you are only looking at True Positives in 1 image, while not taking into account a million other images where there are no people but YOLOv2/3/4/5/... will predict people, producing more False Positives. This is why we measure accuracy on ~20,000 MS COCO test-dev images using an AP metric that considers True/False Positives/Negatives across all possible confidence thresholds.

As @WongKinYiu said, YOLOv7 uses IoU(pred_box, target_box) as the target for objectness in the loss function.

Therefore there are 2 ways:

  1. Preferably re-train YOLOv7 with model.gr = 1.0 https://github.com/WongKinYiu/yolov7/blob/main/train.py#L288
  2. Or just run detection with 2x, 5x or 10x times lower confidence threshold
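The effect of option 2 can be sketched with plain threshold filtering. The scores below are invented for illustration: when the objectness score tracks IoU, occluded people in a crowd land well below the default 0.25 threshold:

```python
# Illustrative only: objectness scores a crowd image might produce when the
# score is trained to track IoU (occluded people -> low IoU -> low score).
scores = [0.92, 0.81, 0.44, 0.31, 0.19, 0.12, 0.07, 0.04]

def detections_kept(scores, conf_thresh):
    """Keep only detections at or above the confidence threshold."""
    return [s for s in scores if s >= conf_thresh]

print(len(detections_kept(scores, 0.25)))   # 4 -> half the crowd missed
print(len(detections_kept(scores, 0.025)))  # 8 -> 10x lower threshold keeps all
```

This is why simply lowering --conf recovers most of the crowd without retraining, at the cost of more low-confidence boxes on other images.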

I just ran prediction using YOLO v2, v3, v4, v7, v7-e6e, v7 (thresh=0.025), and v7-e6e (thresh=0.025). You can see that v7 (thresh=0.025) and v7-e6e (thresh=0.025) predict almost all persons (and several backpacks, handbags, and cell phones) and have more correct predictions and fewer wrong predictions than previous versions v2-v4.

For all models I use 640x640 resolution, while for v7-e6e I use 1280x1280.


YOLOv2: darknet.exe detector test data/coco.data cfg/yolov2.cfg yolov2.weights -thresh 0.25 -ext_output crowd.png

yolov2


YOLOv3: darknet.exe detector test data/coco.data cfg/yolov3.cfg yolov3.weights -thresh 0.25 -ext_output crowd.png

yolov3


YOLOv4: darknet.exe detector test data/coco.data cfg/yolov4.cfg yolov4.weights -thresh 0.25 -ext_output crowd.png

yolov4


YOLOv7 conf 0.25: python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --device 0 --source crowd.png --view-img

yolov7


YOLOv7-e6e conf 0.25: python detect.py --weights yolov7-e6e.pt --conf 0.25 --img-size 1280 --device 0 --source crowd.png --view-img

yolov7_e6e


YOLOv7 conf 0.025: python detect.py --weights yolov7.pt --conf 0.025 --img-size 640 --device 0 --source crowd.png --view-img

yolov7_conf_0_025


YOLOv7-e6e conf 0.025: python detect.py --weights yolov7-e6e.pt --conf 0.025 --img-size 1280 --device 0 --source crowd.png --view-img

yolov7_e6e_conf_0_025

PearlDzzz commented 1 year ago

> It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue.
>
> By the way, my testing result of yolov7 is as below. image

Is there a recommended setting value for self.gr? Thanks.

sadimoodi commented 1 year ago

Same question.

> It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue. By the way, my testing result of yolov7 is as below. image
>
> Is there a recommended setting value for self.gr? Thanks.

Same Q: what is the recommended value for model.gr? Setting the value too high or too low to suit one image would produce poor results on other images.

developer0hye commented 1 year ago

Hi @glenn-jocher !

I found an interesting issue and tested it with YOLOv5. Recent detection models that use predicted localization accuracy as the detection score perform weakly when many objects overlap.
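Part of the difficulty in crowds is independent of the objectness target: standard greedy NMS also removes correct boxes once people overlap enough. A minimal IoU sketch with made-up boxes:

```python
# Sketch of why crowded scenes are hard even with good scores: two correct,
# heavily overlapping person boxes can reach a typical NMS IoU threshold,
# so one of them gets suppressed. Boxes below are invented for illustration.
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

person_a = (100, 50, 160, 250)  # one person
person_b = (120, 50, 180, 250)  # another, standing half-occluded behind
print(iou(person_a, person_b))  # 0.5 -> at a typical NMS threshold,
                                # one of the two correct boxes is dropped
```

Crowd-specific datasets (e.g. the CrowdHuman benchmark) exist largely because of this combination of low-IoU anchor matches and NMS suppression.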

YOLOv5x conf 0.25 crowd_people

YOLOv5l conf 0.25 crowd_people

YOLOv5m conf 0.25 crowd_people

YOLOv5s conf 0.25 crowd_people

YOLOv5n conf 0.25 crowd_people

YOLOv5x conf 0.05 crowd_people

YOLOv5l conf 0.05 crowd_people

YOLOv5m conf 0.05 crowd_people

YOLOv5s conf 0.05 crowd_people

YOLOv5n conf 0.05 crowd_people