Closed: saumitrabg closed this issue 3 years ago
@saumitrabg this is quantified quite well in the README, I believe. It is the raison d'être of the main README chart. You would select the model that best meets your constraints with the fewest compromises on the design factors that matter most to you (size, speed, accuracy).
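As a rough illustration of that selection logic, here is a small sketch (not a YOLOv5 API; the helper name and dict are hypothetical) that picks the most accurate model fitting a latency budget, using the numbers from the table below:

```python
# Hypothetical helper: pick the highest-AP model within a V100 latency
# budget. Numbers mirror the README table; this is illustrative only.
MODELS = {
    "YOLOv5s": {"ap": 36.8, "ms": 2.2, "params_m": 7.3},
    "YOLOv5m": {"ap": 44.5, "ms": 2.9, "params_m": 21.4},
    "YOLOv5l": {"ap": 48.1, "ms": 3.8, "params_m": 47.0},
    "YOLOv5x": {"ap": 50.1, "ms": 6.0, "params_m": 87.7},
}

def pick_model(max_ms):
    """Return the most accurate model meeting the latency budget, or None."""
    candidates = [(v["ap"], name) for name, v in MODELS.items() if v["ms"] <= max_ms]
    return max(candidates)[1] if candidates else None

print(pick_model(4.0))  # YOLOv5l: most accurate model under a 4 ms budget
```

Past the point where the largest model that fits your budget stops improving accuracy on *your* data, anything bigger is the "overkill" the question describes.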
GPU Speed measures end-to-end time per image, averaged over 5000 COCO val2017 images on a V100 GPU at batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from google/automl at batch size 8.
Model | size | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>V100</sub> | FPS<sub>V100</sub> | params | GFLOPS
---|---|---|---|---|---|---|---|---
YOLOv5s | 640 | 36.8 | 36.8 | 55.6 | 2.2ms | 455 | 7.3M | 17.0
YOLOv5m | 640 | 44.5 | 44.5 | 63.1 | 2.9ms | 345 | 21.4M | 51.3
YOLOv5l | 640 | 48.1 | 48.1 | 66.4 | 3.8ms | 264 | 47.0M | 115.4
YOLOv5x | 640 | 50.1 | 50.1 | 68.7 | 6.0ms | 167 | 87.7M | 218.8
YOLOv5x + TTA | 832 | 51.9 | 51.9 | 69.6 | 24.9ms | 40 | 87.7M | 1005.3
- AP<sup>test</sup> denotes COCO test-dev2017 server results; all other AP results denote val2017 accuracy.
- All AP numbers are single-model, single-scale, without ensemble or TTA. Reproduce mAP with `python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65`.
- Speed<sub>GPU</sub> is averaged over 5000 COCO val2017 images using a GCP n1-standard-16 V100 instance, and includes image preprocessing, FP16 inference, postprocessing and NMS. NMS takes 1-2 ms/img. Reproduce speed with `python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45`.
- All checkpoints are trained for 300 epochs with default settings and hyperparameters (no autoaugmentation).
- Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce TTA with `python test.py --data coco.yaml --img 832 --iou 0.65 --augment`.
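The FPS column is simply the reciprocal of the per-image speed; a quick sanity check (values may differ from the table by a rounding unit or so):

```python
# Convert the table's per-image V100 latency (ms) to FPS: FPS = 1000 / ms.
speeds_ms = {"YOLOv5s": 2.2, "YOLOv5m": 2.9, "YOLOv5l": 3.8, "YOLOv5x": 6.0}
for name, ms in speeds_ms.items():
    print(f"{name}: {1000 / ms:.0f} FPS")
```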
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
❔Question
Is there a rule of thumb or guidance for selecting a YOLOv5 model? This would be a good addition to the existing document: https://github.com/ultralytics/yolov5#pretrained-checkpoints
We of course want the best accuracy at the best FPS, but after some point a model becomes overkill; is there a way to quantify that? E.g. my images are 2448x784, but we train with (`--img 1024`, rect) and end up with 1024x328, i.e. ~336K pixels, across 6 classes. So do we go for YOLOv5x or YOLOv5m, with different params? I know a lot of this will depend on actual experimentation, but I wanted to get a feel for the theoretical standpoint.
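For reference, the rectangular-training arithmetic in the example above works out as follows (a sketch, assuming letterbox-style scaling with the default stride of 32; the helper is illustrative, not YOLOv5 code):

```python
import math

# Scale the long side to --img 1024, keep aspect ratio, then pad the
# short side up to a stride multiple (32), as rect/letterbox mode does.
def rect_shape(w, h, img_size=1024, stride=32):
    scale = img_size / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    padded_h = math.ceil(new_h / stride) * stride  # network input height
    return new_w, new_h, padded_h

print(rect_shape(2448, 784))  # (1024, 328, 352): ~336K pixels before padding
```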