Closed. WongKinYiu closed this issue 4 years ago.
@WongKinYiu Hi,
Obviously CSPDarknet53s-PASPP-Mish (~YOLOv4) is much better than amusi YOLOv5l (640x640) (batch-size 16):
CSPDarknet53s-PASPP-Mish: (~YOLOv4) 512x512/608x608
- 45.0% AP - Speed: 8.7/1.6/10.3 ms
YOLOv5l (640x640)/(736x736)
- 44.2% AP - Speed: 11.3/2.2/13.5 ms
While our new YOLOv4 model is even much better:
CSPDarknet53s-PACSP
- 45.1% AP - Speed: 6.6/1.5/8.1 ms
Is there a GitHub repo with amusi YOLOv5l (640x640)?
Train with YOLOv5 setting (640x640)
trained on coco 2017 train set and tested on coco 2017 5k set.
YOLOv3-SPP:
yolov3-spp: 45.5% AP
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Speed: 10.4/2.1/12.6 ms inference/NMS/total per 736x736 image at batch-size 16
@AlexeyAB
- Does it use inference-time data augmentation?
No, there is no inference-time augmentation.
- Why is batch 16 used here?
I just follow the Ultralytics testing protocol with batch size 16.
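As I understand that protocol, the per-image milliseconds quoted in this thread are simply the batched forward time divided by the batch size. A minimal sketch of such a timing harness, with a hypothetical `fake_inference` standing in for the real model call:

```python
import time

def per_image_ms(run_batch, batch_size=16, warmup=3, iters=10):
    """Time a batched call and report milliseconds per image, i.e.
    batch time divided by batch size (Ultralytics-style reporting)."""
    for _ in range(warmup):                # warm-up runs, excluded from timing
        run_batch(batch_size)
    start = time.perf_counter()
    for _ in range(iters):
        run_batch(batch_size)
    elapsed = time.perf_counter() - start
    return elapsed / iters / batch_size * 1000.0

# Hypothetical stand-in for model inference on one batch of images.
def fake_inference(batch_size):
    time.sleep(0.001 * batch_size)         # pretend ~1 ms per image

print(per_image_ms(fake_inference))        # roughly 1 ms per image
```

With a real detector, `run_batch` would be the forward pass on a 16-image tensor; NMS would be timed the same way to get the second number.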
- Is there a GitHub repo with amusi YOLOv5l (640x640)?
It is not amusi's repo, it is Ultralytics's new repo.
- Is the better AP for YOLOv3-SPP achieved just by using 640x640 network resolution, or by something else?
There are some modifications in Ultralytics's new repo.
But yes, I think the main reason for the improvement is the 640x640 training.
And in Ultralytics's new repo, it seems to use an affine transform instead of multi-resolution training.
So the new training won't use too much GPU RAM. (Need to check the code in detail.) training log details
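The memory point above can be sketched numerically: with Darknet-style multi-resolution training the whole network input changes size between draws, while scale jitter inside a fixed canvas keeps the input tensor shape (and hence peak GPU RAM) constant. A minimal sketch; the helper names and the multi-resolution constants are illustrative, not Darknet's exact values:

```python
import random

CANVAS = 640  # fixed network input size, as in the 640x640 setting

def multires_input(base=608, step=32, lo=-3, hi=3):
    """Darknet-style random=1: every few iterations the whole network
    input is resized, so activation memory varies between draws.
    (Constants here are illustrative, not Darknet's exact ones.)"""
    return base + step * random.randint(lo, hi)      # e.g. 512..704

def affine_input(scale=0.5):
    """Ultralytics-style scale jitter (hyperparameter scale=0.5): the image
    content is rescaled by a factor in [1-scale, 1+scale] inside a fixed
    canvas, so the input tensor shape never changes."""
    content = CANVAS * random.uniform(1.0 - scale, 1.0 + scale)
    return CANVAS, content                           # canvas is always 640

random.seed(0)
print(sorted({multires_input() for _ in range(100)}))  # several distinct network sizes
print(affine_input()[0])                               # always 640
```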
I am training CSPDarknet53-PACSP-(SAM)-Mish with darknet on MSCOCO 2017.
And in Ultralytics's new repo, it seems to use an affine transform instead of multi-resolution training.
Yes:
scale=0.5
https://github.com/ultralytics/yolov5/blob/391492ee5b56ef36424b4a9257c18f7c784a8f44/train.py#L44
python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 16
Maybe we should use random=0 resize=1.5 instead of random=1 in the Darknet too?
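If that change were made, the Darknet cfg fragment might look roughly like this. This is a sketch only: I am assuming resize belongs in [net] and random in the [yolo] layers, which should be checked against the Darknet source.

```ini
[net]
width=640
height=640
resize=1.5    # assumed placement: jitter content scale up to 1.5x inside a fixed 640x640 canvas

[yolo]
random=0      # disable multi-resolution training (network input stays 640x640)
```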
@AlexeyAB
OK, I will train this setting on tiny-yolov4 with width=640 and height=640.
If this works well, users can use a cheaper GPU to train YOLO.
@AlexeyAB Hello,
Yes, the AP benefits from the 640x640 training. CSPDarknet53s-YOSPP gets 12.5% faster model inference speed and 0.1% higher AP than YOLOv3-SPP. CSPDarknet53s-YOSPP gets 19.5% faster model inference speed and 1.3% higher AP than YOLOv5l.
YOLOv3-SPP:
yolov3-spp: 45.5% AP @736x736
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Speed: 10.4/2.1/12.6 ms inference/NMS/total per 736x736 image at batch-size 16
CSPDarknet53s-YOSPP: (~YOLOv4(Leaky) backbone + YOLOv3 head)
cd53s-yospp: 45.6% AP @736x736
Model Summary: 225 layers, 4.90092e+07 parameters, 4.90092e+07 gradients
Speed: 9.1/2.0/11.1 ms inference/NMS/total per 736x736 image at batch-size 16
YOLOv5l:
yolov5l 44.2% AP @ 736x736
Model Summary: 231 layers, 6.17556e+07 parameters, 6.17556e+07 gradients
Speed: 11.3/2.2/13.5 ms inference/NMS/total per 736x736 image at batch-size 16
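The 12.5% and 19.5% figures quoted above follow directly from the inference-only times in the three summaries (10.4 ms, 9.1 ms, 11.3 ms per image):

```python
# Inference-only times (ms per image at batch-size 16) from the model summaries above.
yolov3_spp_ms = 10.4
cd53s_yospp_ms = 9.1
yolov5l_ms = 11.3

def speedup_pct(slow_ms, fast_ms):
    """Relative speed improvement of the faster model over the slower one, in percent."""
    return (slow_ms - fast_ms) / slow_ms * 100.0

print(round(speedup_pct(yolov3_spp_ms, cd53s_yospp_ms), 1))  # 12.5
print(round(speedup_pct(yolov5l_ms, cd53s_yospp_ms), 1))     # 19.5
```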
@WongKinYiu Nice.
- Does the s in CSPDarknet53s give improvements for training on both Ultralytics and Darknet?
@AlexeyAB
- Does the s in CSPDarknet53s give improvements for training on both Ultralytics and Darknet?
I am not sure about Darknet, since I did not train it on ImageNet, but yes for Ultralytics.
- Interesting, what AP will a P6 model give if it is trained on 640x640 and tested on 736x736?
To achieve this goal I have to take a look at how to construct a P6 model using the new Ultralytics repository. Then I need to construct the YOLOv4 model; the repository does not currently support all of the blocks of YOLOv4. (Or maybe I will directly modify the pytorch code I currently use.) I think I will design a training scheme to train the P6 model on Darknet first.
@WongKinYiu Hi,
Can you share cfg/weights files for this model?
CSPDarknet53s-PASPP-Mish: (~YOLOv4) - trained 512x512, tested 608x608
cd53s-paspp-mish: 45.0% AP @ 608x608
Model Summary: 212 layers, 6.43092e+07 parameters, 6.43092e+07 gradients
Speed: 8.7/1.6/10.3 ms inference/NMS/total per 608x608 image at batch-size 16
@AlexeyAB
Hi WongKinYiu, what does -PACSP mean? And I can't find its config and weight files, thanks a lot!
Hello, PACSP means applying CSP to the PANet. The model is still in the training process; I will release the .weights file after training finishes.
@amusi Hello,
I saw your article; here I provide some comparisons of the Pytorch versions of YOLOv3, YOLOv4, and YOLOv5. (All experiments are run on the same Tesla V100 GPU.)
Pytorch version
Train with YOLOv3 setting (416x416)
trained on coco 2014 trainvalno5k set and tested on coco 2014 5k set.
YOLOv3-SPP:
Train with YOLOv4 setting (512x512)
trained on coco 2014 trainvalno5k set and tested on coco 2014 5k set.
YOLOv3-SPP:
CSPDarknet53s-YOSPP: (~YOLOv4(Leaky) backbone + YOLOv3 head)
CSPDarknet53s-YOSPP-Mish: (~YOLOv4 backbone + YOLOv3 head)
CSPDarknet53s-PASPP: (~YOLOv4(Leaky))
CSPDarknet53s-PASPP-Mish: (~YOLOv4)
CSPDarknet53s-PACSP:
Train with YOLOv5 setting (640x640)
trained on coco 2017 train set and tested on coco 2017 5k set.
YOLOv3-SPP:
YOLOv5s:
YOLOv5m:
YOLOv5l:
YOLOv5x: