ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
51.08k stars 16.42k forks source link

Quantized model INT8 is not able to do inference or convert to any other type of model structure #9979

Closed Sanath1998 closed 1 year ago

Sanath1998 commented 2 years ago

Search before asking

Question

Hi @glenn-jocher

As I have quantized FLOAT32 model to INT8 model, I'am not able to convert the model to any other formats nor able to inference using detect.py and val.py

Error msg:

  1. While doing inference

    ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float() # FP32 model AttributeError: 'collections.OrderedDict' object has no attribute 'to'

  2. While converting to any other model type using export.py

    [TRT] [E] ModelImporter.cpp:779: ERROR: images:232 In function importInput: [8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype) && "Failed to convert ONNX date type to TensorRT data type."

@glenn-jocher Can you plz look onto this? Looking forward for your reply

Additional

No response

glenn-jocher commented 2 years ago

👋 Hello! Thanks for asking about YOLOv5 🚀 benchmarks. YOLOv5 inference is officially supported in 11 formats, and all formats are benchmarked for identical accuracy and to compare speed every 24 hours by the YOLOv5 CI.

💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks. 💡 ProTip: Export to TensorRT for up to 5x GPU speedup. See GPU Benchmarks.

Format export.py --include Model
PyTorch - yolov5s.pt
TorchScript torchscript yolov5s.torchscript
ONNX onnx yolov5s.onnx
OpenVINO openvino yolov5s_openvino_model/
TensorRT engine yolov5s.engine
CoreML coreml yolov5s.mlmodel
TensorFlow SavedModel saved_model yolov5s_saved_model/
TensorFlow GraphDef pb yolov5s.pb
TensorFlow Lite tflite yolov5s.tflite
TensorFlow Edge TPU edgetpu yolov5s_edgetpu.tflite
TensorFlow.js tfjs yolov5s_web_model/

Benchmarks

Benchmarks below run on a Colab Pro with the YOLOv5 tutorial notebook Open In Colab. To reproduce:

python utils/benchmarks.py --weights yolov5s.pt --imgsz 640 --device 0

Colab Pro V100 GPU

benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=0, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 46.7/166.8 GB disk)

Benchmarks complete (458.07s)
                   Format  mAP@0.5:0.95  Inference time (ms)
0                 PyTorch        0.4623                10.19
1             TorchScript        0.4623                 6.85
2                    ONNX        0.4623                14.63
3                OpenVINO           NaN                  NaN
4                TensorRT        0.4617                 1.89
5                  CoreML           NaN                  NaN
6   TensorFlow SavedModel        0.4623                21.28
7     TensorFlow GraphDef        0.4623                21.22
8         TensorFlow Lite           NaN                  NaN
9     TensorFlow Edge TPU           NaN                  NaN
10          TensorFlow.js           NaN                  NaN

Colab Pro CPU

benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=cpu, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CPU
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 41.5/166.8 GB disk)

Benchmarks complete (241.20s)
                   Format  mAP@0.5:0.95  Inference time (ms)
0                 PyTorch        0.4623               127.61
1             TorchScript        0.4623               131.23
2                    ONNX        0.4623                69.34
3                OpenVINO        0.4623                66.52
4                TensorRT           NaN                  NaN
5                  CoreML           NaN                  NaN
6   TensorFlow SavedModel        0.4623               123.79
7     TensorFlow GraphDef        0.4623               121.57
8         TensorFlow Lite        0.4623               316.61
9     TensorFlow Edge TPU           NaN                  NaN
10          TensorFlow.js           NaN                  NaN

Good luck 🍀 and let us know if you have any other questions!

Sanath1998 commented 2 years ago

@glenn-jocher , my intent was to take a call on know-hows of converting model to quantized INT8 precison and doing inference on the INT8 precision mode? Could you please explain me on this point

glenn-jocher commented 2 years ago

@Sanath1998 depends on the export format. Some are already doing this, i.e. python export.py --tflite --int8

Sanath1998 commented 2 years ago

@glenn-jocher Can't we do quantization for other models which is exported into onnx, tensorrt etc?

glenn-jocher commented 2 years ago

Many formats support FP16 with the --half flag

Sanath1998 commented 2 years ago

Many formats support FP16 with the --half flag

@glenn-jocher actually the models before using --half flag and after using --half flag, the size of the model is same for both. Actually once we reduce to fp16, the model size should decrease ryt? Can u plz explain me the context on this

glenn-jocher commented 2 years ago

👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀.

Not all formats support --half and --int8. See export.py code for details.

We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

For Ultralytics to provide assistance your code should also be:

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

github-actions[bot] commented 1 year ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!