Open AlexeyAB opened 3 years ago
So, when I train these models, should the input image size be large (e.g. 896x896 or 1280x1280)? Or can these models resize the input images to the large input size even if my data is 640x640? I ask because I use custom data that is 640x640.
- It will resize your images automatically.
- If your training and test images are 640x640, then you can set width=640 height=640 in the cfg file and use it for training and detection, so it will be much faster (and you can use lower subdivisions for training, so accuracy will be higher).
- You can try the frozen cfg file https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4-p5-frozen.cfg with the pre-trained weights https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.conv.232. In this case training will be 2x faster and will require 3x less GPU-RAM (and accuracy will be higher if you use a small mini_batch_size of 1 or 2, i.e. batch=64 and subdivisions=64 or 32).
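The relation between batch, subdivisions and mini_batch_size mentioned above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `mini_batch_size` is made up for illustration, and it assumes the mini-batch is simply batch divided by subdivisions, which is how the answer's example (batch=64, subdivisions=64 or 32) reads.

```python
def mini_batch_size(batch: int, subdivisions: int) -> int:
    """Images per forward/backward pass, assuming mini-batch = batch / subdivisions.

    Hypothetical helper for illustration only; it is not part of Darknet.
    """
    if batch <= 0 or subdivisions <= 0:
        raise ValueError("batch and subdivisions must be positive")
    return batch // subdivisions


# The two settings quoted in the answer above.
for subdivisions in (64, 32):
    size = mini_batch_size(64, subdivisions)
    print(f"batch=64 subdivisions={subdivisions} -> mini_batch_size={size}")
```

Lower subdivisions means a larger mini-batch per pass, which is why it needs more GPU-RAM.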
Thanks for your quick answer.
What is the difference between yolov4-p5.cfg and yolov4-p5-frozen.cfg? Is the only difference that pre-trained weights are used when training starts?
I ask because I need to change the anchors and classes in the cfg for my data.
And when I train or test yolov4-p5 or p6, should I use the latest darknet? I am currently using darknet-64efa721ede91cd8ccc18257f98eeba43b73a6af (for CUDA 10.0).
Thank you.
You can train yolov4-p5.cfg with or without pre-trained weights.
You must train yolov4-p5-frozen.cfg only with pre-trained weights (you can set a higher mini-batch size, and training will be several times faster).
yolov4-p5-frozen.cfg contains a stopbackward=1 line, which freezes all previous layers: https://github.com/AlexeyAB/darknet/blob/86ced7151a71c05fab57bc14f77d6a4bb97b9ee6/cfg/yolov4-p5-frozen.cfg#L1691-L1698
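To see where a cfg file places its freeze point, one can scan it for the stopbackward=1 line. The sketch below is a rough illustration, not Darknet's own parser: `parse_sections`, `stopbackward_layer_index` and the tiny `SAMPLE_CFG` are all hypothetical, and the sample is a toy stand-in rather than an excerpt of yolov4-p5-frozen.cfg.

```python
def parse_sections(cfg_text):
    """Split Darknet-style cfg text into an ordered list of (name, options) pairs."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def stopbackward_layer_index(sections):
    """Index of the last layer marked stopbackward=1 ([net] is not counted), or None."""
    layers = [s for s in sections if s[0] != "net"]
    marked = [i for i, (_, opts) in enumerate(layers)
              if opts.get("stopbackward") == "1"]
    return marked[-1] if marked else None


# Toy cfg for illustration only; not the real yolov4-p5-frozen.cfg.
SAMPLE_CFG = """
[net]
width=640
height=640

[convolutional]
filters=32
activation=mish

[convolutional]
filters=64
activation=mish
stopbackward=1

[convolutional]
filters=340
activation=logistic

[yolo]
classes=80
"""

index = stopbackward_layer_index(parse_sections(SAMPLE_CFG))
print(f"stopbackward=1 found at layer index {index}")
```

Going by the answer above, the layers before the reported index would be the frozen ones, and only the layers after it are trained.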
Thank you for your answer.
So, I have one more question.
When my custom data is 640x640, it will be resized automatically, e.g. to 896x896 for p5 or 1280x1280 for p6, which are the input image sizes in your answer. So what is the purpose of 'width' and 'height' in the cfg file (YOLOv4-p5, YOLOv4-p6) for training or testing?
And if the input is resized automatically, do YOLOv4.cfg and YOLOv4-csp.cfg also resize to 608x608 (the input image size in the paper)?
Thank you.
For Detection - use the same as usual:

You can download pre-trained weights on COCO:
- yolov4-p5.cfg: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.weights
- yolov4-p6.cfg: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p6.weights

For Training - change these lines before each of the 3 [yolo]-layers for p5 (or 4 for p6): https://github.com/AlexeyAB/darknet/blob/9a86fce494b1d82b774d36be76747fcb58f81aa4/cfg/yolov4-p5.cfg#L1810-L1811
- filters=<(5 + num_classes) x 4>
- activation=logistic - for training and detection by using Darknet: https://github.com/AlexeyAB/darknet
- activation=linear - for training and detection by using Pytorch Scaled-YOLOv4 (CSP-branch): https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp

For training use pre-trained weights:
- yolov4-p5.cfg: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.conv.232
- yolov4-p6.cfg: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p6.conv.289

Currently Pytorch is more suitable for training on multiple-GPUs.
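The filters rule given above, (5 + num_classes) x 4, is easy to get wrong by hand when editing several [yolo] heads, so here is a minimal sketch of the arithmetic. The helper name `yolo_head_filters` is hypothetical, and reading the factor 4 as the number of anchors per [yolo] layer is my assumption; the thread only gives the formula itself.

```python
def yolo_head_filters(num_classes: int, anchors_per_layer: int = 4) -> int:
    """filters value for the layer before each [yolo] layer.

    Implements (5 + num_classes) x 4 from the thread. The parameter name
    anchors_per_layer is an assumed interpretation of the factor 4.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (5 + num_classes) * anchors_per_layer


# A few example class counts, including the 80 COCO classes.
for classes in (1, 3, 80):
    print(f"classes={classes} -> filters={yolo_head_filters(classes)}")
```

The same value goes before every [yolo] layer (3 for p5, 4 for p6), together with the matching classes= setting for your data.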
Accuracy - Speed:
Scaled-YOLOv4-P6 is slower, but +2% more accurate than YOLOR-P6, which is the best in terms of speed/accuracy for the Waymo autonomous driving challenge: https://github.com/AlexeyAB/darknet/issues/7828
Speed and accuracy on COCO are validated by using Pytorch: https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp

![112776361-281d8380-9048-11eb-8083-8728b12dcd55 (1)](https://user-images.githubusercontent.com/4096485/123530761-ba482d00-d706-11eb-858b-2c64e04df78b.png)
Validation on Pytorch (COCO2017 test-dev):

**YOLOv4-P5:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.51789
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70340
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.56663
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35761
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56555
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.64820
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.38684
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.64156
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.69884
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.54954
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.74445
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.82892
```

**YOLOv4-P6:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.54413
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.72666
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.59473
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.39403
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.58871
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.67314
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39711
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.66439
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.72261
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.59167
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.76267
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84392
```

Validation on Darknet (COCO2017 test-dev):

**YOLOv4-P5:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.516
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.700
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.564
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.335
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.548
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.632
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.385
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.635
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.679
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.494
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.719
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.819
```

**YOLOv4-P6:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.540
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.721
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.590
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.365
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.573
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.654
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.396
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.657
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.703
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.531
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.736
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.837
```

**YOLOv4x-mish-640:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.501
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.685
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.545
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.315
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.623
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.380
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.622
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.663
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.465
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.706
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.817
```

**YOLOv4csp-640:**
```
overall performance
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.487
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.674
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.530
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.300
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.527
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.609
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.372
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.607
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.648
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.445
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.695
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.803
```