open-mmlab / mmyolo

OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.
https://mmyolo.readthedocs.io/zh_CN/dev/
GNU General Public License v3.0

Slower Training Than Ultralytics #964

Open davidhuangal opened 5 months ago

davidhuangal commented 5 months ago

I have noticed that training is significantly slower with MMYOLO as opposed to Ultralytics using the same parameters and environment. I.e., using the same set of GPUs with the same batch size, both using AMP, both using distributed training, etc.

By significantly, I mean in the range of 3x-4x. Has anyone else run into this issue or figured out how to fix it? The FAQ mentions that mosaic augmentation can be a bottleneck, so I tried the cached mosaic variant and even removed mosaic entirely, but saw no significant increase in training speed.
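For reference, the cached-mosaic variant mentioned above is toggled by flags on MMYOLO's `Mosaic` transform. A minimal pipeline-entry sketch follows; the parameter names mirror MMYOLO's RTMDet configs, and the values are illustrative, not tuned:

```python
# Sketch of one train_pipeline entry using MMYOLO's cached mosaic.
# `use_cached` / `max_cached_images` are the knobs the FAQ points at.
train_pipeline_mosaic = dict(
    type='Mosaic',
    img_scale=(640, 640),
    pad_val=114.0,
    use_cached=True,       # reuse already-decoded images from an in-memory cache
    max_cached_images=40,  # cache size; larger trades RAM for loading speed
)

# The other experiment in this issue (no mosaic at all) is just
# dropping this entry from train_pipeline in the config.
```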

lianxintao commented 5 months ago

Yes, I've run into the same problem.

PushpakBhoge commented 2 months ago

@davidhuangal I think it is a trade-off. The Ultralytics implementation is very tightly written: the code is hard to read, and if you want to modify it, good luck! The point is that it is not very modular or readable, and as a result not very customization-friendly. MMYOLO is much more customizable: I used YOLOv5 as the RPN in a Mask R-CNN from mmdet, and also swapped in a ResNet backbone. All of those things are possible, and we lose speed for flexibility. It's a trade-off.
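The modularity claim above comes down to the OpenMMLab config system, where a backbone is just a registered `dict` and swapping it is a few lines. A hedged sketch (the component names follow the mmdet/mmyolo registries; the omitted neck and heads would still need matching channel counts, so this is not a complete working model):

```python
# Sketch: swapping the backbone of a detector in OpenMMLab-style config.
# Only the backbone section is meaningful here; the rest is a placeholder.
model = dict(
    type='MaskRCNN',
    backbone=dict(
        type='ResNet',             # any registered backbone can be dropped in
        depth=50,
        out_indices=(0, 1, 2, 3),  # which stages feed the neck
    ),
    # neck / rpn_head / roi_head omitted for brevity; they only need
    # in_channels that match the chosen backbone's outputs.
)
```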

It could be improved, but the project seems abandoned; there have been no commits on the main branch for 9 months.