AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Yolo-v3 performs better than Yolo-v4 #6079

Open opcap opened 4 years ago

opcap commented 4 years ago

I'm using Yolo with pretrained weights for object detection on very low-quality, low-resolution, grainy CCTV footage. I'm trying to detect and count the people in a frame.

I have noticed that in my case (very low-resolution, grainy pictures) Yolo-v3 performs much better and finds many more "person"s than Yolo-v4. I have tried many different thresholds and settings, but Yolo-v3 clearly finds objects much better than Yolo-v4.

My understanding was that Yolo-v4's mAP score was much higher, so it should be better at finding objects, not worse. Is there any reason why this is happening? My hunch is that Yolo-v4 might just be worse at detecting objects in low-resolution images, but I haven't run enough tests to make an informed claim.

I blurred an example photo here, and in this example you can see how Yolo-v3 performs better. With my real data (my own CCTV footage, which I'm not allowed to share here) the difference is like night and day: yolov3_vs_yolov4

To run the test detection I used the following commands for Yolo-v3 and Yolo-v4 respectively:

./darknet detect cfg/yolov3.cfg ~/yolov3.weights passengers.jpg 
./darknet detect cfg/yolov4.cfg ~/yolov4.weights passengers.jpg 

I ran both commands with the stock pre-trained weights and the default cfg files.

Here is the original photo that you can use to reproduce this: passengers

Thanks for your help and thanks for maintaining this awesome project!

AlexeyAB commented 4 years ago
  1. Use width=416 height=416 in the cfg file for small images like yours. At that size both models make 2 errors on this image: YOLOv3 (416x416): 1 FN + 1 FP; YOLOv4 (416x416): 2 FN.

YOLOv4 (416x416) results: predictions

  2. If you want to detect objects in blurred images, train your models on blurred images or set blur=1 in the cfg file.

  3. Read: https://medium.com/@alexeyab84/yolov4-the-most-accurate-real-time-neural-network-on-ms-coco-dataset-73adfd3602fe?source=friends_link&sk=6039748846bbcf1d960c3061542591d7

There can always be an image where one algorithm works poorly while another works well, and vice versa. That is why detection algorithms are tested on a large set of ~20,000 images with 80 different object classes (the MS COCO test-dev dataset).

  4. Also, different models sometimes need different confidence thresholds; try -thresh 0.2 or 0.15.
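For anyone reproducing this, the input-size and threshold suggestions above could be tried roughly as follows. This is only a sketch: `set_input_size` is a hypothetical helper (not part of darknet), the output file name `cfg/yolov4-416.cfg` is an assumption, and it assumes the cfg keeps `width=` and `height=` at the start of their own lines, as in the stock files.

```shell
# Hypothetical helper (not part of darknet): write a copy of a cfg file
# with width= and height= both set to the given size, leaving the
# original cfg untouched. darknet expects sizes that are multiples of 32.
# Usage: set_input_size SRC_CFG DST_CFG SIZE
set_input_size() {
  sed -e "s/^width=.*/width=$3/" -e "s/^height=.*/height=$3/" "$1" > "$2"
}

# Example usage (the paths are assumptions about your local checkout):
#   set_input_size cfg/yolov4.cfg cfg/yolov4-416.cfg 416
#   ./darknet detect cfg/yolov4-416.cfg ~/yolov4.weights passengers.jpg -thresh 0.2
#   ./darknet detect cfg/yolov4-416.cfg ~/yolov4.weights passengers.jpg -thresh 0.15
```

Writing to a copy rather than editing the cfg in place keeps the stock 608x608 configuration available for comparison. The same helper works for cfg/yolov3.cfg, so both models can be compared at the same input size and threshold.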