quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

Error when quantizing YOLOv5 using AIMET #2665

Open Rax-Lie opened 7 months ago

Rax-Lie commented 7 months ago

Hi Team, I have been trying to quantize the YOLOv5 (v6.0) model using AIMET (1.27.0) recently, and I need to obtain an encodings file to proceed with the next step of my work. I received an error message from the CustomMarker:

TypeError: 'CustomMarker' object is not iterable
2024-01-24 17:18:04,539 - Utils - WARNING - naming of onnx op at non-leaf modules failed, skipping naming of non-leaf
2024-01-24 17:18:04,616 - Utils - INFO - successfully created onnx model with 245/309 node names updated
2024-01-24 17:18:04,811 - Quant - INFO - Layers excluded from quantization: []

The full log file: yolov5.log

Here is my code:

import torch

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.onnx_utils import OnnxExportApiArgs

if __name__ == '__main__':
    # Load YOLOv5m from a local clone of the yolov5 repository
    model = torch.hub.load('ultra/yolov5', 'yolov5m', source='local', skip_validation=True)
    model = model.cuda().eval()  # keep the model on the same device as the dummy input

    input_shape = (1, 3, 640, 640)
    dummy_input = torch.rand(input_shape).cuda()

    # Insert quantization simulation ops: 8-bit weights and activations
    quantsim = QuantizationSimModel(model=model, dummy_input=dummy_input,
                                    quant_scheme=QuantScheme.post_training_tf_enhanced,
                                    config_file="./aimet_config.json",
                                    rounding_mode='nearest', default_output_bw=8, default_param_bw=8,
                                    in_place=False)

    # Export the ONNX model and the encodings file
    quantsim.export(path="results",
                    onnx_export_args=OnnxExportApiArgs(opset_version=11,
                                                       input_names=['input'],
                                                       output_names=['output']),
                    filename_prefix='yolov5',
                    dummy_input=dummy_input.cpu())

I followed the suggestion in #1067, but I only obtained the converted ONNX model; the corresponding encodings file was empty, as follows:

yolov5.encodings

{
    "activation_encodings": {},
    "excluded_layers": [],
    "param_encodings": {},
    "quantizer_args": {
        "activation_bitwidth": 8,
        "dtype": "int",
        "is_symmetric": true,
        "param_bitwidth": 8,
        "per_channel_quantization": true,
        "quant_scheme": "post_training_tf_enhanced"
    },
    "version": "0.6.1"
}

And my environment is as follows:

AimetCommon               torch-gpu-1.27.0
AimetTorch                torch-gpu-1.27.0
torch                     1.9.1+cu111
torchvision               0.10.1+cu111
onnx                      1.11.0
onnxruntime               1.11.0
quic-hitameht commented 7 months ago

Hi @Rax-Lie

Thanks for reaching out. There are two issues here:

1) The encodings JSON file is empty in your example because you have not performed the "calibration" step, which AIMET calls compute_encodings. After creating the QuantizationSimModel object, quantsim.model is not yet ready to use: quantizer nodes have been added to the model graph, but their quantization parameters (scale/offset for every weight and activation tensor) have not yet been computed. To compute them, pass representative data through quantsim.model. The notebook linked below shows an example of a callback function that passes data through the model for the compute_encodings step; a minimal sketch also follows the links. Once you have computed encodings, you can export the model to ONNX format using the quantsim.export() API.

User guide: https://quic.github.io/aimet-pages/releases/latest/user_guide/quantization_sim.html
Example notebook: https://quic.github.io/aimet-pages/releases/latest/Examples/torch/quantization/qat.html
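For reference, here is a minimal sketch of the calibration step, continuing from your script above. calibration_loader is a placeholder for a DataLoader over your own representative dataset; the linked notebook shows the same pattern in full:

import torch

def pass_calibration_data(sim_model, use_cuda):
    # Forward-pass callback: run representative data through the quantsim
    # model so AIMET can observe tensor ranges for scale/offset computation
    device = torch.device('cuda' if use_cuda else 'cpu')
    sim_model.eval()
    with torch.no_grad():
        # calibration_loader is hypothetical -- substitute a DataLoader over
        # a few hundred representative images from your dataset
        for batch_idx, (images, _) in enumerate(calibration_loader):
            sim_model(images.to(device))
            if batch_idx >= 20:  # a small number of batches is usually enough
                break

# Compute scale/offset for all weight and activation quantizers ...
quantsim.compute_encodings(forward_pass_callback=pass_calibration_data,
                           forward_pass_callback_args=True)

# ... then export; the encodings file will now be populated
quantsim.export(path="results",
                onnx_export_args=OnnxExportApiArgs(opset_version=11,
                                                   input_names=['input'],
                                                   output_names=['output']),
                filename_prefix='yolov5',
                dummy_input=dummy_input.cpu())

After compute_encodings() runs, the exported yolov5.encodings file should contain populated activation_encodings and param_encodings sections.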

2) Regarding the TypeError: 'CustomMarker' object is not iterable: we released a fix for it some time back which, I believe, is not part of the AIMET 1.27 release. You can either upgrade AIMET to version 1.30 (the latest) or apply the patch: https://github.com/quic/aimet/pull/2548

Hope this helps. Please let us know if you have further questions.