AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Added YOLOv4-P5 (896x896) and YOLOv4-P6 (1280x1280) Scaled-YOLOv4-models #7838

Open AlexeyAB opened 3 years ago

AlexeyAB commented 3 years ago

For Detection - use the same as usual:

./darknet detector test cfg/coco.data cfg/yolov4-p5.cfg yolov4-p5.weights -ext_output dog.jpg
./darknet detector test cfg/coco.data cfg/yolov4-p6.cfg yolov4-p6.weights -ext_output dog.jpg

You can download pre-trained weights on COCO:


For Training - change these lines before each of the 3 (for p5) or 4 (for p6) [yolo]-layers: https://github.com/AlexeyAB/darknet/blob/9a86fce494b1d82b774d36be76747fcb58f81aa4/cfg/yolov4-p5.cfg#L1810-L1811

filters=<(5 + num_classes) x 4>
activation=logistic - for training and detection by using Darknet: https://github.com/AlexeyAB/darknet
activation=linear - for training and detection by using Pytorch Scaled-YOLOv4 (CSP-branch): https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp
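The filters rule above can be checked with a quick calculation. This is only a minimal sketch: the helper name `yolo_filters` is hypothetical, and it assumes the "x 4" in the formula stands for 4 anchors per [yolo] layer.

```python
def yolo_filters(num_classes: int, anchors_per_layer: int = 4) -> int:
    """Filters for the [convolutional] layer placed before a [yolo] layer.

    Implements filters = (5 + num_classes) x 4 from the comment above.
    anchors_per_layer=4 is an assumption about what the "x 4" means.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (5 + num_classes) * anchors_per_layer


# 80 classes (COCO) and two hypothetical custom datasets
for n in (80, 3, 1):
    print(f"classes={n} -> filters={yolo_filters(n)}")
```

For a custom dataset, the same value would go into the filters= line before every [yolo] layer (3 of them for p5, 4 for p6).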

For training use pre-trained weights:

Currently Pytorch is more suitable for training on multiple GPUs.


Accuracy - Speed:

Scaled-YOLOv4-P6 is slower, but 2% more accurate than YOLOR-P6, which is the best in terms of speed/accuracy for the Waymo autonomous driving challenge: https://github.com/AlexeyAB/darknet/issues/7828


Speed and accuracy on COCO are validated by using Pytorch: https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp


Validation on Pytorch (COCO2017 test-dev):

**YOLOv4-P5:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.51789
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70340
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.56663
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35761
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56555
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.64820
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.38684
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.64156
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.69884
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.54954
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.74445
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.82892
```

**YOLOv4-P6:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.54413
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.72666
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.59473
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.39403
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.58871
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.67314
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39711
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.66439
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.72261
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.59167
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.76267
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84392
```

Validation on Darknet (COCO2017 test-dev):

**YOLOv4-P5:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.516
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.700
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.564
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.335
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.548
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.632
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.385
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.635
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.679
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.494
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.719
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.819
```

**YOLOv4-P6:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.540
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.721
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.590
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.365
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.573
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.654
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.396
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.657
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.703
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.531
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.736
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.837
```

**YOLOv4x-mish-640:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.501
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.685
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.545
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.315
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.623
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.380
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.622
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.663
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.465
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.706
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.817
```

**YOLOv4csp-640:**

```
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.487
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.674
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.530
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.300
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.527
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.609
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.372
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.607
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.648
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.445
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.695
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.803
```

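The Pytorch and Darknet numbers for P5 and P6 differ slightly, and the gap is easy to tabulate. A minimal sketch follows; the AP values (IoU=0.50:0.95, area=all) are copied from the listings above, while the dictionary and function names are hypothetical.

```python
# AP @[ IoU=0.50:0.95 | area=all | maxDets=100 ] copied from the listings above
PYTORCH_AP = {"YOLOv4-P5": 0.51789, "YOLOv4-P6": 0.54413}
DARKNET_AP = {"YOLOv4-P5": 0.516, "YOLOv4-P6": 0.540}


def ap_gap(pytorch: dict, darknet: dict) -> dict:
    """Return Pytorch AP minus Darknet AP for every model present in both."""
    return {m: pytorch[m] - darknet[m] for m in pytorch if m in darknet}


for model, gap in ap_gap(PYTORCH_AP, DARKNET_AP).items():
    print(f"{model}: Pytorch - Darknet AP = {gap:.5f}")
```

Note that the Darknet listing is rounded to three decimals, so the computed gap is only accurate to about that precision.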
namasang1 commented 3 years ago

So, when I train these models, should the input image size be large (e.g. 896x896 or 1280x1280)? Or can these models resize the input images to the larger input size even if my data is 640x640? I ask because I use custom data that is 640x640.

AlexeyAB commented 3 years ago
namasang1 commented 3 years ago

Thanks for your quick answer.

What is the difference between yolov4-p5.cfg and yolov4-p5-frozen.cfg in cfg? Is the only difference the use of pre-trained weights when training starts?

I ask because I need to change the anchors and classes in the cfg to match my data.

And when I train or test yolov4-p5 or p6, should I use the latest Darknet? Right now I am using darknet-64efa721ede91cd8ccc18257f98eeba43b73a6af (for CUDA 10.0).

Thank you.

AlexeyAB commented 3 years ago

yolov4-p5-frozen.cfg contains a stopbackward=1 line, which freezes all previous layers: https://github.com/AlexeyAB/darknet/blob/86ced7151a71c05fab57bc14f77d6a4bb97b9ee6/cfg/yolov4-p5-frozen.cfg#L1691-L1698
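To see where the freeze boundary sits in a given cfg, one could scan the file for the stopbackward=1 line. This is a rough sketch, not Darknet's own parser: the function name and the sample cfg text are hypothetical, and it assumes layers are numbered from 0 starting after the [net] section.

```python
from typing import List, Tuple

# Hypothetical, heavily shortened cfg text used only for illustration
SAMPLE_CFG = """
[net]
width=896
height=896

[convolutional]
filters=32

[convolutional]
filters=64
stopbackward=1

[yolo]
classes=80
"""


def find_stopbackward(cfg_text: str) -> List[Tuple[int, str]]:
    """Return (layer_index, section_name) for each section with stopbackward=1."""
    hits = []
    layer = -1   # index of the current layer; [net] is not counted as a layer
    name = ""
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name not in ("net", "network"):
                layer += 1
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "stopbackward" and value == "1":
                hits.append((layer, name))
    return hits


print(find_stopbackward(SAMPLE_CFG))  # [(1, 'convolutional')]
```

In this sample, layers 0 and 1 would be the frozen part and only the layers after index 1 would be trained.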

namasang1 commented 3 years ago

thank you for your answer

so, i have one more question.

When my custom data size is 640 x 640, it will be resized automatically, e.g. to 896 x 896 for p5 or 1280 x 1280 for p6, which are the input image sizes given in your answer. So what is the purpose of 'width' and 'height' in the cfg file (yolov4-p5, yolov4-p6) during training or testing?

Because if the input is resized automatically, do yolov4.cfg and yolov4-csp.cfg also resize it to 608 x 608 (the input image size in the paper)?

thank you