Open AlexeyAB opened 4 years ago
Just downloaded and installed YOLO v4 and compared it to YOLO v3, running OpenCV 4.4 on a Linux desktop, on a standard set of images of crowds of people. The new version is far worse than v3 at getting an accurate people count. In one image where v3 found 60 people, v4 finds only 17, and in almost all cases the "probability" numbers are lower for the v4 detections. The source code, images, and blob sizes are all identical for both runs.
Is anyone else noticing this issue?
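For anyone who wants to reproduce the comparison, here is a minimal sketch of how the people count could be measured with OpenCV's `dnn` module while keeping the input size and thresholds identical for both models. The file names, the 608x608 input size, and the 0.25 / 0.4 confidence and NMS thresholds are assumptions for illustration, not values taken from the report above. It also assumes both models use the standard 80-class `coco.names`, where `person` is class 0.

```python
PERSON_CLASS_ID = 0  # "person" is the first entry in the 80-class coco.names


def count_people(class_ids, scores, min_score=0.25):
    """Count detections labelled as person with a score of at least min_score."""
    return sum(
        1
        for class_id, score in zip(class_ids, scores)
        if int(class_id) == PERSON_CLASS_ID and float(score) >= min_score
    )


def detect(cfg_path, weights_path, image_path, size=(608, 608), conf=0.25, nms=0.4):
    """Run one Darknet model through OpenCV and return flat (class_ids, scores) lists."""
    # Imported lazily so count_people() can be used without OpenCV installed.
    import cv2
    import numpy as np

    model = cv2.dnn_DetectionModel(weights_path, cfg_path)
    # Darknet models expect RGB input scaled to [0, 1].
    model.setInputParams(size=size, scale=1 / 255.0, swapRB=True)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    class_ids, scores, _boxes = model.detect(image, confThreshold=conf, nmsThreshold=nms)
    # Some OpenCV versions return (N, 1) arrays, and empty tuples when nothing is found.
    return (
        np.asarray(class_ids).reshape(-1).tolist(),
        np.asarray(scores).reshape(-1).tolist(),
    )


def compare_models(image_path, models, **detect_kwargs):
    """Return {name: people_count} for each (cfg, weights) pair, using the same settings."""
    counts = {}
    for name, (cfg_path, weights_path) in models.items():
        class_ids, scores = detect(cfg_path, weights_path, image_path, **detect_kwargs)
        counts[name] = count_people(class_ids, scores)
    return counts
```

A call such as `compare_models("crowd.jpg", {"v3": ("yolov3.cfg", "yolov3.weights"), "v4": ("yolov4.cfg", "yolov4.weights")})` (hypothetical paths) would give the two counts side by side. Posting those numbers together with the exact input size and thresholds used makes it easier for others to check whether they see the same gap.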
Links, examples and additional information about Yolo v4: https://medium.com/@alexeyab84/yolov4-the-most-accurate-real-time-neural-network-on-ms-coco-dataset-73adfd3602fe?source=friends_link&sk=6039748846bbcf1d960c3061542591d7
Discussion: reddit
YOLOv4 (608x608) - 62 FPS V100 - 43.5% AP - 65.7% AP50 on MSCOCO testdev
62 FPS — YOLOv4 (608x608 batch=1) on Tesla V100 — by using Darknet-framework
400 FPS — YOLOv4 (416x416 batch=4) on RTX 2080 Ti — by using TensorRT+tkDNN
32 FPS — YOLOv4 (416x416 batch=1) on Jetson AGX Xavier — by using TensorRT+tkDNN
11 FPS — YOLOv4 (256x256 async=3, leaky instead of mish) on 1 Watt neurochip Intel Myriad X — by using OpenCV (with OpenVINO IE backend), with accuracy 33.3% AP — 53.0% AP50 (while YOLOv3 416x416 — 31.0% AP — 55.3% AP50)