Added YOLOv4-P5 (896x896) and YOLOv4-P6 (1280x1280) Scaled-YOLOv4-models

AlexeyAB commented 3 years ago

For Detection - use the same as usual:

./darknet detector test cfg/coco.data cfg/yolov4-p5.cfg yolov4-p5.weights -ext_output dog.jpg
./darknet detector test cfg/coco.data cfg/yolov4-p6.cfg yolov4-p6.weights -ext_output dog.jpg

You can download pre-trained weights on COCO:

for yolov4-p5.cfg https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.weights
for yolov4-p6.cfg https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p6.weights

For Training - change these lines before each of the [yolo]-layers (3 for p5, 4 for p6): https://github.com/AlexeyAB/darknet/blob/9a86fce494b1d82b774d36be76747fcb58f81aa4/cfg/yolov4-p5.cfg#L1810-L1811

filters=<(5 + num_classes) x 4>
activation=logistic - for training and detection by using Darknet: https://github.com/AlexeyAB/darknet
activation=linear - for training and detection by using Pytorch Scaled-YOLOv4 (CSP-branch): https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp
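As a quick sanity check before editing the cfg, the filters rule above can be computed with a small helper. This is only a sketch: the function name `yolo_head_settings` and the framework labels "darknet" and "pytorch" are made up for illustration and are not part of Darknet or the Pytorch repository.

```python
def yolo_head_settings(num_classes: int, framework: str = "darknet") -> dict:
    """Values for the layer before each [yolo] layer (hypothetical helper)."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")

    # activation=logistic for Darknet, activation=linear for the Pytorch CSP-branch
    activations = {"darknet": "logistic", "pytorch": "linear"}
    if framework not in activations:
        raise ValueError("framework must be 'darknet' or 'pytorch'")

    # filters = (5 + num_classes) x 4, as stated in the post
    return {
        "filters": (5 + num_classes) * 4,
        "activation": activations[framework],
    }


if __name__ == "__main__":
    for classes in (1, 3, 80):
        print(classes, yolo_head_settings(classes))
```

For the 80 COCO classes this gives filters=340, and for a single-class custom dataset it gives filters=24.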

For training use pre-trained weights:

for yolov4-p5.cfg https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.conv.232
for yolov4-p6.cfg https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p6.conv.289

Currently Pytorch is more suitable for training on multiple GPUs.

Accuracy - Speed:

Scaled-YOLOv4-P6 - 54.5% AP (COCO) - 30 FPS
YOLOR-P6 - 52.6% AP (COCO) - 49 FPS

Scaled-YOLOv4-P6 is slower (30 vs 49 FPS), but about 2% AP more accurate (54.5% vs 52.6%) than YOLOR-P6, which is the best in terms of speed/accuracy for the Waymo autonomous driving challenge: https://github.com/AlexeyAB/darknet/issues/7828


Speed and accuracy on COCO are validated by using Pytorch: https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp

namasang1 commented 3 years ago

So, when I train these models, should the input images be large (e.g. 896x896 or 1280x1280)? Or can these models resize the input images to the large input size even if my data is 640x640? I ask because my custom data is 640x640.

AlexeyAB commented 3 years ago

It will resize your images automatically.
If your training and test images are 640x640, then you can set width=640 height=640 in the cfg file and use it for training and detection, so it will be much faster (and you can use lower subdivisions for training - accuracy will be higher).
You can try to use the Frozen cfg file https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4-p5-frozen.cfg with pre-trained weights https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.conv.232 - in this case training will be 2x faster and it will require 3x less GPU-RAM (and in this case accuracy will be higher if you use a small mini_batch_size of 1 or 2, i.e. batch=64 and subdivisions=64 or 32).
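The mini_batch_size numbers above follow from dividing batch by subdivisions. A minimal sketch of that arithmetic is below; the helper name `mini_batch_size` is hypothetical, and rejecting a batch that is not divisible by subdivisions is a choice made for this sketch, not something stated in the thread.

```python
def mini_batch_size(batch: int, subdivisions: int) -> int:
    """Number of images processed at once: batch / subdivisions (hypothetical helper)."""
    if batch <= 0 or subdivisions <= 0:
        raise ValueError("batch and subdivisions must be positive")
    if batch % subdivisions != 0:
        # Assumption of this sketch: only accept evenly divisible settings.
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


if __name__ == "__main__":
    for subdivisions in (64, 32, 16):
        print("batch=64 subdivisions=%d -> mini_batch_size=%d"
              % (subdivisions, mini_batch_size(64, subdivisions)))
```

So batch=64 with subdivisions=64 gives a mini-batch of 1 and subdivisions=32 gives 2, which matches the values in the post. Lower subdivisions means a larger mini-batch and more GPU memory per step.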

namasang1 commented 3 years ago

Thanks for your quick answer.

What is the difference between yolov4-p5.cfg and yolov4-p5-frozen.cfg? Is the only difference that pre-trained weights are used when training starts?

I ask because I need to change the anchors and classes in the cfg for my data.

Also, when I train or test yolov4-p5 or p6, should I use the latest darknet? Right now I am using darknet-64efa721ede91cd8ccc18257f98eeba43b73a6af (for CUDA 10.0).

Thank you

AlexeyAB commented 3 years ago

You can train yolov4-p5.cfg with or without pre-trained weights
You must train yolov4-p5-frozen.cfg only with pre-trained weights (you can set a higher mini-batch size, and training will be several times faster)

yolov4-p5-frozen.cfg contains a stopbackward=1 line, which freezes all previous layers: https://github.com/AlexeyAB/darknet/blob/86ced7151a71c05fab57bc14f77d6a4bb97b9ee6/cfg/yolov4-p5-frozen.cfg#L1691-L1698
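To see where a cfg file is frozen, one option is to scan it for stopbackward=1 and report the layer it sits in. The sketch below is not part of Darknet: the function name `frozen_layer_indices`, the toy `SAMPLE_CFG`, and the convention that the first section after [net] is layer 0 are all assumptions made for illustration.

```python
from typing import List

# Toy cfg text for illustration only; it is NOT the real yolov4-p5-frozen.cfg.
SAMPLE_CFG = """\
[net]
width=896
height=896

[convolutional]
filters=32

[convolutional]
filters=64
stopbackward=1

[yolo]
classes=80
"""


def frozen_layer_indices(cfg_text: str) -> List[int]:
    """Return indices of layers that contain stopbackward=1 (hypothetical helper)."""
    layer_index = -1  # the [net] section is not counted as a layer
    hits = []
    for raw_line in cfg_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            if line.lower() not in ("[net]", "[network]"):
                layer_index += 1
            continue
        key, sep, value = line.partition("=")
        if (sep and layer_index >= 0
                and key.strip() == "stopbackward" and value.strip() == "1"):
            hits.append(layer_index)
    return hits


if __name__ == "__main__":
    print(frozen_layer_indices(SAMPLE_CFG))
```

With the toy text above, the second [convolutional] section (index 1 under this sketch's counting) carries stopbackward=1, so that layer and everything before it would stay fixed during training, while later layers such as [yolo] still train.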

namasang1 commented 3 years ago

Thank you for your answer.

I have one more question.

My custom data is 640x640, and according to your answer it will be resized automatically to the network input size (896x896 for p5 or 1280x1280 for p6). So what is the purpose of 'width' and 'height' in the cfg file (yolov4-p5, yolov4-p6) for training or testing?

If the input is resized automatically, do yolov4.cfg and yolov4-csp.cfg also resize it to 608x608 (the input image size in the paper)?

Thank you

Added YOLOv4-P5 (896x896) and YOLOv4-P6 (1280x1280) Scaled-YOLOv4-models #7838