Closed: Sanath1998 closed this issue 1 year ago.
👋 Hello! Thanks for asking about YOLOv5 🚀 benchmarks. YOLOv5 inference is officially supported in 11 formats, and every format is benchmarked by the YOLOv5 CI every 24 hours to verify identical accuracy and to compare speed.
💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks.
💡 ProTip: Export to TensorRT for up to 5x GPU speedup. See GPU Benchmarks.
| Format | `export.py --include` | Model |
|---|---|---|
| PyTorch | - | yolov5s.pt |
| TorchScript | `torchscript` | yolov5s.torchscript |
| ONNX | `onnx` | yolov5s.onnx |
| OpenVINO | `openvino` | yolov5s_openvino_model/ |
| TensorRT | `engine` | yolov5s.engine |
| CoreML | `coreml` | yolov5s.mlmodel |
| TensorFlow SavedModel | `saved_model` | yolov5s_saved_model/ |
| TensorFlow GraphDef | `pb` | yolov5s.pb |
| TensorFlow Lite | `tflite` | yolov5s.tflite |
| TensorFlow Edge TPU | `edgetpu` | yolov5s_edgetpu.tflite |
| TensorFlow.js | `tfjs` | yolov5s_web_model/ |
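For example, to export yolov5s.pt to several of these formats in one call (a usage sketch based on the `--include` arguments listed above):

python export.py --weights yolov5s.pt --include torchscript onnx openvino engine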
Benchmarks below run on a Colab Pro instance with the YOLOv5 tutorial notebook. To reproduce:
python utils/benchmarks.py --weights yolov5s.pt --imgsz 640 --device 0
benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=0, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 46.7/166.8 GB disk)
Benchmarks complete (458.07s)
| Format | mAP@0.5:0.95 | Inference time (ms) |
|---|---|---|
| PyTorch | 0.4623 | 10.19 |
| TorchScript | 0.4623 | 6.85 |
| ONNX | 0.4623 | 14.63 |
| OpenVINO | NaN | NaN |
| TensorRT | 0.4617 | 1.89 |
| CoreML | NaN | NaN |
| TensorFlow SavedModel | 0.4623 | 21.28 |
| TensorFlow GraphDef | 0.4623 | 21.22 |
| TensorFlow Lite | NaN | NaN |
| TensorFlow Edge TPU | NaN | NaN |
| TensorFlow.js | NaN | NaN |
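The CPU run below uses the same script; judging from the `device=cpu` entry in its log, the corresponding command is:

python utils/benchmarks.py --weights yolov5s.pt --imgsz 640 --device cpu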
benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=cpu, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CPU
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 41.5/166.8 GB disk)
Benchmarks complete (241.20s)
| Format | mAP@0.5:0.95 | Inference time (ms) |
|---|---|---|
| PyTorch | 0.4623 | 127.61 |
| TorchScript | 0.4623 | 131.23 |
| ONNX | 0.4623 | 69.34 |
| OpenVINO | 0.4623 | 66.52 |
| TensorRT | NaN | NaN |
| CoreML | NaN | NaN |
| TensorFlow SavedModel | 0.4623 | 123.79 |
| TensorFlow GraphDef | 0.4623 | 121.57 |
| TensorFlow Lite | 0.4623 | 316.61 |
| TensorFlow Edge TPU | NaN | NaN |
| TensorFlow.js | NaN | NaN |
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher, my intent was to understand how to convert the model to quantized INT8 precision and run inference in INT8 mode. Could you please explain this?
@Sanath1998 it depends on the export format. Some already support this, e.g. python export.py --include tflite --int8
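For reference, INT8 TFLite export relies on TensorFlow post-training quantization, where a representative dataset calibrates activation ranges. A minimal standalone sketch with the `tf.lite` API (the random calibration data, input shape, and file names are assumptions, not YOLOv5's actual export pipeline):

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Calibration samples shaped like the model input; real images should
    # be used in practice. Shape (1, 640, 640, 3) is assumed here.
    for _ in range(100):
        yield [np.random.rand(1, 640, 640, 3).astype(np.float32)]

# Convert an exported SavedModel (e.g. yolov5s_saved_model/) to INT8 TFLite.
converter = tf.lite.TFLiteConverter.from_saved_model("yolov5s_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()

with open("yolov5s-int8.tflite", "wb") as f:
    f.write(tflite_model)
```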
@glenn-jocher Can't we do quantization for models exported to other formats such as ONNX, TensorRT, etc.?
Many formats support FP16 with the --half flag
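For example, a usage sketch for an FP16 TensorRT export (which also requires a GPU device):

python export.py --weights yolov5s.pt --include engine --half --device 0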
@glenn-jocher actually the model size is the same before and after using the --half flag. Once we reduce to FP16 the model size should decrease, right? Can you please explain this?
👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀.
Not all formats support --half and --int8. See export.py code for details.
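One quick way to verify whether an export actually stored FP16 weights is to inspect the parameter dtypes. A minimal sketch for a PyTorch checkpoint (the file name is assumed; the `ema`/`model` keys follow the checkpoint layout detect.py reads):

```python
import torch

ckpt = torch.load("yolov5s.pt", map_location="cpu")
model = ckpt.get("ema") or ckpt["model"]  # same layout detect.py expects
print(next(model.parameters()).dtype)     # torch.float16 if saved in half
```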
We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.
When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:
For Ultralytics to provide assistance your code should also be current: verify that your code is up to date with the GitHub master branch, and if necessary `git pull` or `git clone` a new copy to ensure your problem has not already been solved in master.

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.
Thank you! 😃
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Search before asking
Question
Hi @glenn-jocher
As I have quantized the FLOAT32 model to INT8, I'm not able to convert the model to any other format, nor am I able to run inference using detect.py and val.py.
Error messages:

While doing inference:

ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float()  # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'

While converting to any other format using export.py:

[TRT] [E] ModelImporter.cpp:779: ERROR: images:232 In function importInput: [8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype) && "Failed to convert ONNX date type to TensorRT data type."
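For context on the first error above: detect.py and val.py expect `ckpt['model']` to be an `nn.Module`, so a quantized model saved only as a `state_dict` (an OrderedDict) fails at `.to(device)`. A hedged sketch of saving in the expected layout, with a stand-in module in place of the real INT8 model:

```python
import torch
import torch.nn as nn

# Stand-in for an INT8-quantized YOLOv5 module (placeholder, not the real model).
quantized_model = nn.Sequential(nn.Conv2d(3, 16, 3))

# detect.py/val.py call (ckpt.get('ema') or ckpt['model']).to(device), which
# requires a full nn.Module, not an OrderedDict of weights.
torch.save({"model": quantized_model}, "yolov5s-int8.pt")
```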
@glenn-jocher Can you please look into this? Looking forward to your reply.
Additional
No response