quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

Significant Precision Loss Occurs Through DepthWiseConv2D on DSP #2474

Open xiexiaozheng opened 1 year ago

xiexiaozheng commented 1 year ago

AIMET version: 1.28; SNPE version: 2.14; deploy platform: SM8550 DSP; weights 8-bit, activations 8-bit, bias 32-bit

I have a model whose backbone is MobileNetV3. As you know, MobileNetV3 consists primarily of pointwise and depthwise convolutions. I used AIMET 1.28 to run Quantization-Aware Training (QAT) on this model, and the QAT model reached the same accuracy as the FP32 model. The QAT configuration is as follows:

"defaults": {
        "ops": {
            "is_output_quantized": "True"
        },
        "params": {
            "is_quantized": "True"
        },
        "strict_symmetric": "False",
        "unsigned_symmetric": "True",
        "per_channel_quantization": "False"
    },
    "params": {
        "bias": {
            "is_quantized": "False"
        }
    },
    "op_type": {},

weight bitwidth = 8, activation bitwidth = 8
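
For context, here is a minimal sketch of how a config like the one above is typically wired into AIMET 1.x QAT on the PyTorch side. The tiny stand-in model, the calibration callback, and the file names are placeholders for illustration, not the code from this issue:

    # Minimal AIMET 1.x QAT sketch (PyTorch). The tiny model below stands in
    # for the MobileNetV3-based backbone described above.
    import torch
    from aimet_common.defs import QuantScheme
    from aimet_torch.quantsim import QuantizationSimModel

    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 16, 1),                          # pointwise
        torch.nn.Conv2d(16, 16, 3, padding=1, groups=16),   # depthwise
        torch.nn.Hardswish(),
    ).eval()
    dummy_input = torch.randn(1, 3, 224, 224)

    def calibrate(sim_model, _):
        # Forward passes on representative data to collect initial encodings.
        with torch.no_grad():
            sim_model(dummy_input)

    sim = QuantizationSimModel(
        model,
        dummy_input=dummy_input,
        quant_scheme=QuantScheme.training_range_learning_with_tf_init,
        default_param_bw=8,                   # weight bitwidth = 8
        default_output_bw=8,                  # activation bitwidth = 8
        config_file="quantsim_config.json",   # the JSON shown above
    )
    sim.compute_encodings(forward_pass_callback=calibrate,
                          forward_pass_callback_args=None)

    # ... QAT fine-tuning of sim.model with the usual training loop goes here ...

    sim.export(path="./export", filename_prefix="model_qat", dummy_input=dummy_input)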

I exported the model in ONNX format along with a statistics file and converted it to DLC. When testing on the DSP (Digital Signal Processor), I observed a significant drop in accuracy.

To identify which layer of the model is causing this issue, my debugging approach was as follows: I split the network into two parts, ran the first part on the DSP, and ran the second part through the QAT simulation model generated by QuantizationSimModel, using the DSP output as its input. After these tests, I noticed that accuracy drops every time the model passes through a depthwise convolution (initially, with fewer channels, the drop is not significant). It is especially noticeable in one depthwise convolution layer with 672 channels, where the accuracy drops by half. Is this because I am using per-tensor quantization? Interestingly, the QAT model itself does not suffer from any accuracy degradation; the significant drop only occurs when the model is deployed on the DSP. Additionally, each depthwise convolution is followed by a HardSwish activation, and I added ["Conv", "HardSwish"] to the supergroups section of the configuration.
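
To illustrate the per-tensor question, here is a small self-contained numpy sketch (synthetic weights, not the real model) comparing the 8-bit round-trip error of a 672-channel depthwise kernel under a single shared scale versus per-channel scales:

    # Why a shared (per-tensor) 8-bit scale can hurt a depthwise conv with many
    # channels: a few wide-range channels inflate the scale and wash out the
    # small-range channels. Synthetic numbers, for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)
    channels, k = 672, 3
    w = rng.normal(0.0, 0.02, size=(channels, k, k))   # mostly small weights
    w[:8] = rng.normal(0.0, 1.0, size=(8, k, k))       # a few outlier channels

    def round_trip(x, scale):
        return np.clip(np.round(x / scale), -128, 127) * scale

    scale_tensor = np.abs(w).max() / 127.0                             # one shared scale
    scale_channel = np.abs(w).max(axis=(1, 2), keepdims=True) / 127.0  # one scale per channel

    err_tensor = np.abs(w - round_trip(w, scale_tensor)).mean()
    err_channel = np.abs(w - round_trip(w, scale_channel)).mean()
    print(f"mean abs error  per-tensor: {err_tensor:.6f}  per-channel: {err_channel:.6f}")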

The quantization statistics for the depthwise convolution weights and for the HardSwish outputs both come from the encoding file exported by AIMET, while the quantization statistics for the bias come from the snpe-dlc-quant command-line tool.
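
One way to sanity-check those statistics is to inspect the exported encoding file directly. A rough sketch below, assuming the AIMET 1.x encodings JSON layout with a top-level "param_encodings" dict whose entries carry per-tensor "min"/"max" values (the file name and field names may differ between versions):

    # Rank parameter tensors by encoding range to spot layers (e.g. the
    # 672-channel depthwise conv) whose shared range is dominated by outliers.
    # The file name and the JSON keys are assumptions; adjust to your export.
    import json

    with open("model_qat.encodings") as f:
        encodings = json.load(f)

    spans = []
    for name, encs in encodings.get("param_encodings", {}).items():
        enc = encs[0]   # per-tensor quantization: a single encoding entry
        spans.append((float(enc["max"]) - float(enc["min"]), name))

    for span, name in sorted(spans, reverse=True)[:10]:
        print(f"{span:10.4f}  {name}")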

The model structure is shown in the attached image.

quic-akinlawo commented 1 year ago

Hello @xiexiaozheng, per channel quantization is recommended for convolution-like operations to increase the resolution.
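
Assuming the same config-file schema as the excerpt above, that change amounts to flipping the flag in the "defaults" section (the other defaults stay as they are), and then re-running QAT so the parameter encodings are re-learned:

    "defaults": {
        "per_channel_quantization": "True"
    },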

xiexiaozheng commented 1 year ago

@quic-akinlawo But the accuracy of the QAT model is fine; the accuracy only drops when the model is deployed to the DSP.
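
One generic way to narrow down where the simulation and the target diverge is to dump the same intermediate tensor from both runs and compare them numerically. A rough sketch, with placeholder file names (not from this issue):

    # Compare one intermediate tensor dumped from the QAT simulation and from
    # the on-target (DSP) run. File names, dtype and metrics are illustrative.
    import numpy as np

    sim_out = np.fromfile("sim_depthwise672_out.raw", dtype=np.float32)
    dsp_out = np.fromfile("dsp_depthwise672_out.raw", dtype=np.float32)

    err = sim_out - dsp_out
    sqnr_db = 10.0 * np.log10((sim_out ** 2).mean() / max((err ** 2).mean(), 1e-12))
    cosine = float(np.dot(sim_out, dsp_out) /
                   (np.linalg.norm(sim_out) * np.linalg.norm(dsp_out) + 1e-12))
    print(f"SQNR: {sqnr_db:.2f} dB  cosine: {cosine:.4f}  max|err|: {np.abs(err).max():.4f}")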

1SingleFeng commented 11 months ago

Is there any progress on this matter? I am very curious as to what led to this result.

pubuzai commented 2 months ago

@1SingleFeng Hi, I encountered the same issue. Have you found the cause of the problem?