ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Significant Variations in Training Results with Same Dataset and Parameters #13341

Open timiil opened 3 days ago

timiil commented 3 days ago

Question

Hi everyone,

We’ve encountered a noticeable discrepancy in performance metrics when training the same model (yolov8n.pt) on the same dataset across different hardware. The training parameters are similar but not identical: batch size and epoch count differ between some runs. The results, specifically mAP (50-95), vary significantly across setups.

Base Model: yolov8n.pt

Training Parameters:

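| No. | Hardware | Epochs | Batch Size | mAP (50-95) |
|-----|----------|--------|------------|-------------|
| 1 | A6000 (48 GB VRAM) | 100 | 16 | 0.961 |
| 2 | 4090 (24 GB VRAM) | 100 | 12 | 0.93 |
| 3 | 4090 (24 GB VRAM) | 150 | 12 | 0.92 |
| 4 | L20 (48 GB VRAM) | 100 | 16 | 0.976 |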
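
For reference, the runs above differ in batch size and epoch count as well as in hardware. The following is a minimal sketch, not taken from the training code, of how batch size alone can change the effective batch and the number of optimizer steps. The dataset size of 1,000 images is a hypothetical value (the real size is not given here), and the gradient-accumulation rule with a nominal batch size of 64 is an assumption about how the trainer behaves.

```python
import math


def steps_per_run(num_images: int, batch_size: int, epochs: int,
                  nominal_batch_size: int = 64) -> dict:
    """Estimate effective batch size and optimizer steps for one training run.

    Assumption: gradients are accumulated over round(nominal / batch) batches
    before each optimizer step, with a nominal batch size of 64.
    """
    batches_per_epoch = math.ceil(num_images / batch_size)
    accumulate = max(round(nominal_batch_size / batch_size), 1)
    return {
        "batches_per_epoch": batches_per_epoch,
        "accumulate": accumulate,
        "effective_batch": batch_size * accumulate,
        "optimizer_steps": epochs * math.ceil(batches_per_epoch / accumulate),
    }


# Hypothetical dataset size; the (batch size, epochs) pairs come from the table.
NUM_IMAGES = 1000
for batch_size, epochs in [(16, 100), (12, 100), (12, 150)]:
    print(batch_size, epochs, steps_per_run(NUM_IMAGES, batch_size, epochs))
```

Under these assumptions, batch size 12 gives an effective batch of 60 rather than 64 and a different number of optimizer steps per epoch, so runs 2 and 3 are not strictly comparable to runs 1 and 4 even before hardware is considered.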

We also tried enabling and disabling coslr (cosine learning-rate scheduling), but it had little to no effect on the outcome.

Could anyone shed light on what might be causing this inconsistency? Additionally, what strategies could we adopt to achieve better performance on more limited hardware setups?

Thank you in advance for your help!

UltralyticsAssistant commented 3 days ago

👋 Hello @timiil, thank you for bringing this to our attention! 🚀 This is an automated response to help guide you, and one of our Ultralytics engineers will assist you soon.

Please ensure you are following our ⭐️ Tutorials for accurate setup, including checking out our Custom Data Training and Tips for Best Training Results.

Since you are seeing variations in results, could you provide a minimal reproducible example to help us understand and debug the issue? Keeping the setup consistent across hardware (same batch size, epochs, seed, and package versions) is crucial, and an example may reveal differences that are not obvious from the summary table.
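
As one way to make runs easier to compare, the random number generators can be seeded identically on every machine. The sketch below is a generic helper, not part of the YOLOv5 codebase; the name `seed_everything` is hypothetical, and NumPy and PyTorch are seeded only if they are installed. Even with identical seeds, different GPU models and library versions can still produce slightly different floating-point results, so this narrows the variation rather than removing it.

```python
import os
import random


def seed_everything(seed: int = 0) -> list:
    """Seed the available RNGs and return the names of the libraries seeded."""
    seeded = ["random"]
    random.seed(seed)
    # Only affects hash randomization in child processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
        # Prefer deterministic cuDNN kernels over auto-tuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        seeded.append("torch")
    except ImportError:
        pass

    return seeded


# Call once at the start of each training script, with the same seed everywhere.
print(seed_everything(0))
```

Reporting the seed together with the Python, PyTorch, and CUDA versions for each machine makes it easier to tell hardware effects apart from configuration differences.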

To verify your setup, please check the following:

Requirements

Python>=3.8.0 with all requirements.txt installed, including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 can be run in various environments with all dependencies preinstalled.

For more insights on your question, additional details like dataset image examples and training logs would be helpful.

We also invite you to explore our latest model - YOLOv8 🚀, which might offer enhanced performance for your tasks.

Thank you for your patience and contribution! 😊