THU-MIG / yolov10

YOLOv10: Real-Time End-to-End Object Detection
https://arxiv.org/abs/2405.14458
GNU Affero General Public License v3.0

How to Convert YOLOv10 Model to TFLite with INT8 Quantization? #176

Open AhmedFkih opened 4 weeks ago

AhmedFkih commented 4 weeks ago

Hi everyone,

I’m working on a project that involves deploying a YOLOv10 model on a mobile/edge device. To improve inference speed and reduce the model size, I want to convert my YOLOv10 model to TensorFlow Lite (TFLite) with INT8 quantization.

- What is the best approach to convert the YOLOv10 model to TFLite with INT8 quantization?
- Are there any specific calibration techniques required for YOLOv10 during quantization?
- Can someone provide a code example or point me to resources for performing INT8 quantization on YOLOv10?

nenenkosi commented 4 weeks ago

+1

senceryucel commented 4 weeks ago

Hi @AhmedFkih @nenenkosi

You can use the following code, which I used for YOLOX more than a year ago. I can't remember the package versions, but you should be able to resolve any incompatibility issues easily.

Do not forget that you need to convert the YOLOv10 model from .pt to ONNX first (check this).
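
As a rough sketch of that export step (assuming the ultralytics-style Python API bundled with the YOLOv10 repo; the checkpoint name is a placeholder):

from ultralytics import YOLO

# load a YOLOv10 checkpoint; "yolov10n.pt" is a placeholder file name
model = YOLO("yolov10n.pt")

# export to ONNX; the opset and simplify values here are illustrative
model.export(format="onnx", opset=13, simplify=True)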

Please be aware that you may need to change the parameters of the quantization call (quantize_static), since my requirements were probably quite different from yours. You should also look at the quantize_dynamic method, which you can import from onnxruntime.quantization alongside quantize_static; see the sketch after the code below.

import numpy as np
from onnxruntime.quantization import quantize_static, CalibrationMethod, CalibrationDataReader, QuantType, QuantFormat

# paths of the float32 input model and the int8 output model
onnx_model_input_path = "YOLOv10.onnx"
onnx_model_output_path = "output.onnx"

input_shape = (1, 3, 640, 640)
num_calibration_samples = 100

# calibration data reader (dummy data here; use real preprocessed images in practice)
class DummyDataReader(CalibrationDataReader):
    def __init__(self, num_samples, input_shape):
        self.num_samples = num_samples
        self.input_shape = input_shape
        self.current_sample = 0

    def get_next(self):
        # return one input dict per call; None signals the end of calibration
        if self.current_sample < self.num_samples:
            self.current_sample += 1
            # 'images' must match the ONNX model's input name
            return {'images': self.generate_random_input()}
        return None

    def generate_random_input(self):
        return np.random.uniform(-1, 1, size=self.input_shape).astype(np.float32)

calibration_data_reader = DummyDataReader(num_calibration_samples, input_shape)

# quantize the model to int8; the result is written to onnx_model_output_path
# (quantize_static returns None, so there is nothing to assign)
quantize_static(
    model_input=onnx_model_input_path,
    model_output=onnx_model_output_path,
    calibration_data_reader=calibration_data_reader,
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
    quant_format=QuantFormat.QDQ,
    per_channel=False,
    calibrate_method=CalibrationMethod.MinMax,
)
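
For completeness, a minimal sketch of the quantize_dynamic route under the same assumptions: it needs no calibration data at all, because only the weights are quantized offline and activation ranges are computed at runtime.

from onnxruntime.quantization import quantize_dynamic, QuantType

# weights are quantized offline; activations stay float and are
# quantized on the fly at inference time, so no data reader is needed
quantize_dynamic(
    model_input="YOLOv10.onnx",
    model_output="output_dynamic.onnx",
    weight_type=QuantType.QInt8,
)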

rabion1234 commented 2 weeks ago

hey @senceryucel

I want to convert my YOLOv10 ONNX model to TFLite, but your code produces an INT8-quantized ONNX model, not a TFLite one. I have already converted my model to ONNX; it's the TFLite step that is giving me problems.
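
For reference, a common route is ONNX → TensorFlow SavedModel (via a converter such as onnx2tf) → TFLite. A minimal, untested sketch of the final TFLite step with full INT8 quantization, assuming the SavedModel already exists in a placeholder yolov10_saved_model directory and using random calibration data:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # yield samples shaped like the model input (often NHWC after the
    # ONNX-to-TensorFlow conversion); use real preprocessed images instead
    # of random data for a usable model
    for _ in range(100):
        yield [np.random.uniform(0, 1, size=(1, 640, 640, 3)).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("yolov10_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# restrict to int8 ops so conversion fails loudly if an op cannot be quantized
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("yolov10_int8.tflite", "wb") as f:
    f.write(tflite_model)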

enzoferrari1 commented 1 week ago

Hi! Was this solved? I'm facing the same issue.