ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Guidance on YOLO model selection #1965

Closed saumitrabg closed 3 years ago

saumitrabg commented 3 years ago

❔Question

Is there a rule of thumb or guidance on selecting a YOLOv5 model? This would be a good addition to the existing documentation: https://github.com/ultralytics/yolov5#pretrained-checkpoints

We of course want the best accuracy at the best FPS, but after some point a model becomes overkill. Is there a way to quantify that? For example, my images are 2448x784 but we train with `--img 1024 --rect`, which comes out to roughly 1024x328, i.e. about 335K pixels, across 6 classes. So do we go for YOLOv5x or YOLOv5m, and with which parameters? I know a lot of this will depend on actual experimentation, but I wanted to get a feel for it from a theoretical standpoint.
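For concreteness, a rough way I could sanity-check this per model at my actual training resolution would be something like the sketch below (just an illustration using torch.hub; the model names, input shape, and single-pass timing are my assumptions, not from the docs):

```python
import time
import torch

# Rough sketch: load each pretrained checkpoint via torch.hub and time one
# forward pass at approximately my rectangular training resolution
# (1024x328, padded to 1024x352 so both sides are multiples of stride 32),
# to see where extra capacity stops paying for itself.
for name in ["yolov5s", "yolov5m", "yolov5l", "yolov5x"]:
    model = torch.hub.load("ultralytics/yolov5", name, pretrained=True).eval()
    x = torch.zeros(1, 3, 352, 1024)  # (batch, channels, height, width)
    with torch.no_grad():
        t0 = time.time()
        model(x)
        dt = time.time() - t0
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M params, {dt * 1000:.1f} ms/img (single pass, no warmup)")
```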

Additional context

glenn-jocher commented 3 years ago

> We of course want the best accuracy at the best FPS, but after some point a model becomes overkill. Is there a way to quantify that?

@saumitrabg this is quantified quite well in the README, I believe; it is the raison d'être of the main README chart. You would select the model that best meets your constraints, with the fewest compromises on the design factors that matter most to you (size, speed, accuracy).

** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from google/automl at batch size 8.

| Model | size | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>V100</sub> | FPS<sub>V100</sub> | params | GFLOPS |
|-------|------|------------------|-------------------|-----------------|----------------------|--------------------|--------|--------|
| YOLOv5s | 640 | 36.8 | 36.8 | 55.6 | 2.2ms | 455 | 7.3M | 17.0 |
| YOLOv5m | 640 | 44.5 | 44.5 | 63.1 | 2.9ms | 345 | 21.4M | 51.3 |
| YOLOv5l | 640 | 48.1 | 48.1 | 66.4 | 3.8ms | 264 | 47.0M | 115.4 |
| YOLOv5x | 640 | 50.1 | 50.1 | 68.7 | 6.0ms | 167 | 87.7M | 218.8 |
| YOLOv5x + TTA | 832 | 51.9 | 51.9 | 69.6 | 24.9ms | 40 | 87.7M | 1005.3 |

- AP<sup>test</sup> denotes COCO test-dev2017 server results; all other AP results denote val2017 accuracy.
- All AP numbers are for single-model single-scale without ensemble or TTA. Reproduce mAP by `python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65`
- Speed<sub>GPU</sub> averaged over 5000 COCO val2017 images using a GCP n1-standard-16 V100 instance, and includes image preprocessing, FP16 inference, postprocessing and NMS. NMS is 1-2 ms/img. Reproduce speed by `python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45`
- All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation). Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce TTA by `python test.py --data coco.yaml --img 832 --iou 0.65 --augment`
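As a concrete reading of the chart above, here is a minimal sketch (not part of the repo) of choosing the smallest checkpoint that still meets a latency budget and an accuracy floor; note the FPS column is just 1000 divided by the per-image latency (e.g. 1000 / 2.2 ms ≈ 455):

```python
# Minimal sketch (not part of the repo): pick the smallest model that satisfies
# both an accuracy floor and a latency budget, using the V100 numbers above.
MODELS = [
    # (name, AP_val, V100 ms/img, params in millions)
    ("yolov5s", 36.8, 2.2, 7.3),
    ("yolov5m", 44.5, 2.9, 21.4),
    ("yolov5l", 48.1, 3.8, 47.0),
    ("yolov5x", 50.1, 6.0, 87.7),
]

def pick(min_ap, max_ms):
    """Return the smallest model meeting both constraints, or None."""
    for name, ap, ms, _params in MODELS:  # ordered smallest to largest
        if ap >= min_ap and ms <= max_ms:
            return name
    return None

print(pick(min_ap=44.0, max_ms=5.0))  # -> yolov5m
```

Swapping checkpoints in training or evaluation is just a matter of changing the `--weights` argument (e.g. `--weights yolov5m.pt`), so benchmarking two neighbouring candidates on your own validation set is usually the cheapest way to find where the larger model stops adding value.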

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.