quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

When is_symmetric in the Quantization Simulation Configuration file is True, the offset in the encoding file is not 0 #2220

Open wangjun-0 opened 1 year ago

wangjun-0 commented 1 year ago

I used AIMET to quantize a model. When is_symmetric in the Quantization Simulation Configuration file is True, the offset in the resulting encoding file is not 0; it is -128.

Can you tell me the reason? I thought the offset is always 0 when using symmetric quantization.

quic-mangal commented 1 year ago

@wangjun-0, yes, it should be 0. Can you print(sim) and see if all quantizers are using symmetric mode?

wangjun-0 commented 1 year ago

@quic-mangal, thanks for your answer. When I print(sim), it prints <aimet_tensorflow.keras.quantsim.QuantizationSimModel object at 0x7fd4205a0a30>, so I cannot see the details.
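Since print(sim) only shows the object repr for Keras, one way to check individual quantizers is to walk the wrapped layers. A minimal sketch, with the caveat that the attribute names (param_quantizers, output_quantizers, is_symmetric) are assumptions based on AIMET's QcQuantizeWrapper and may differ between versions:

```python
# Assumes `sim` is the QuantizationSimModel from above and sim.model holds
# the wrapped Keras model. Check dir(layer) for your AIMET release if these
# attribute names don't match.
for layer in sim.model.layers:
    for q in getattr(layer, 'param_quantizers', []):
        print(layer.name, getattr(q, 'name', '?'), 'symmetric:', getattr(q, 'is_symmetric', '?'))
    for q in getattr(layer, 'output_quantizers', []):
        print(layer.name, getattr(q, 'name', '?'), 'symmetric:', getattr(q, 'is_symmetric', '?'))
```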

My quantsim config file looks like this:

```json
{
  "defaults": {
    "ops": {
      "is_output_quantized": "True",
      "is_symmetric": "False"
    },
    "params": {
      "is_quantized": "True",
      "is_symmetric": "True"
    },
    "strict_symmetric": "False",
    "per_channel_quantization": "False"
  },
  "params": {
    "weight": {
      "is_quantized": "True",
      "is_symmetric": "True"
    },
    "bias": {
      "is_quantized": "True",
      "is_symmetric": "True"
    }
  },
  "op_type": {
    "Squeeze": { "is_output_quantized": "True" },
    "Pad": { "is_output_quantized": "True" },
    "Mean": { "is_output_quantized": "True" }
  },
  "supergroups": [],
  "model_input": {
    "is_input_quantized": "True"
  },
  "model_output": {
    "is_output_quantized": "True"
  }
}
```

After running sim.compute_encodings() and sim.export(), it generated a floating-point model and an encoding file.
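For context, a minimal sketch of that flow, based on the AIMET Keras quantsim API; `model` and `calibration_dataset` are placeholders for your own Keras model and representative input data, and argument names may differ slightly between AIMET releases:

```python
from aimet_tensorflow.keras.quantsim import QuantizationSimModel

def forward_pass(model, _):
    # Run calibration data through the model so the quantizers can
    # observe the ranges needed to compute encodings.
    model.predict(calibration_dataset)  # placeholder dataset

sim = QuantizationSimModel(model,
                           config_file='quantsim_config.json',
                           default_param_bw=8,
                           default_output_bw=8)
sim.compute_encodings(forward_pass, forward_pass_callback_args=None)
sim.export(path='./export', filename_prefix='quantized_model')
```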

The param_encodings section of the encoding file contains an entry for conv1_conv/bias:0 in which is_symmetric is True but the offset is -128.

I want to know how I can get the fixed-point values using the exported floating-point model and the encoding file.

quic-mangal commented 1 year ago

Oh, I should have asked which framework you are using; I was assuming PyTorch.

The quantsim is a simulator, so when you export, we expect you to take the model + encodings to target. To get back a quantized model, you can create a sim again; unfortunately, for Keras we don't have an API like load_encodings_to_sim, so you will have to compute the encodings again.

wangjun-0 commented 1 year ago

@quic-mangal Thanks for your answer. I am using a Keras model for quantization, and it works fine; I can quantize the Keras model now. Can I use the exported floating-point model and the encoding file to get a fixed-point model, i.e. a plain Keras model with fixed-point values rather than a sim?

quic-mangal commented 1 year ago

> Can I use the exported floating-point model and the encoding file to get a fixed-point model, i.e. a plain Keras model with fixed-point values rather than a sim?

It is not currently possible to do that with AIMET. You can quantize the weights yourself using the encodings, but you won't be able to do much for the outputs/activations.
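For the weights specifically, a minimal sketch of applying the exported encodings by hand, assuming the usual AIMET encoding-file layout (a param_encodings dict whose per-tensor entries carry scale, offset, and bitwidth fields); the file path is a placeholder:

```python
import json
import numpy as np

# Placeholder path; sim.export() writes a <filename_prefix>.encodings JSON file.
with open('./export/quantized_model.encodings') as f:
    encodings = json.load(f)

def quantize_param(name, w):
    """Map a float parameter tensor to its stored fixed-point values."""
    enc = encodings['param_encodings'][name][0]  # per-tensor encoding assumed
    scale, offset, bw = enc['scale'], enc['offset'], enc['bitwidth']
    # Storage convention assumed: w ~= scale * (q + offset), q in [0, 2^bw - 1]
    q = np.clip(np.round(w / scale) - offset, 0, 2 ** bw - 1)
    return q.astype(np.uint8)  # uint8 assumes bw == 8

# e.g. q_bias = quantize_param('conv1_conv/bias:0', float_bias_array)
```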

Linking QNN, in case it is applicable to your use case: https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk

quic-hitameht commented 1 year ago

@wangjun-0 To answer your previous question, adding to what @quic-mangal said: the offset is the quantized value corresponding to the floating-point value 0.0, and we want the stored data in the range [0, 2^bitwidth - 1]. So if the data has negative values, the offset is used to shift the range. For example, in your case the offset is -128, so for the range [-127, 127] the negative values from -127 to -1 are mapped to 1 (-127 - (-128)) through 127 (-1 - (-128)), and the values from 0 to 127 are mapped to 128 (0 - (-128)) through 255 (127 - (-128)).
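To make that concrete, a small numeric sketch of the same arithmetic; the scale value is made up for illustration, and the convention assumed is real ≈ scale · (stored + offset):

```python
import numpy as np

scale, offset, bitwidth = 0.01, -128, 8        # illustrative scale; offset from above
w = np.array([-1.27, -0.01, 0.0, 0.5, 1.27])   # example float values

q_signed = np.round(w / scale)                             # signed grid values
q_stored = np.clip(q_signed - offset, 0, 2**bitwidth - 1)  # shifted into [0, 255]
w_back = scale * (q_stored + offset)                       # dequantized values

print(q_stored)  # [  1. 127. 128. 178. 255.]
print(w_back)    # approximately [-1.27 -0.01  0.    0.5   1.27]
```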

wangjun-0 commented 1 year ago

@quic-hitameht Thank you for your answer. I thought symmetric quantization maps the float values to [-128, 127] and asymmetric quantization maps them to [0, 255]. But from your answer the stored values are [0, 255] in both cases, so I cannot tell symmetric and asymmetric quantization apart. Can you give me an example of symmetric and asymmetric quantization and the formula for each, so I can understand the details? Thanks.
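For reference, the standard textbook formulas, written with the dequantization convention x ≈ Δ(q + z) used in the thread; AIMET's exact rounding and clamping details may differ:

```latex
% Asymmetric: scale and offset derived from the observed range [x_min, x_max]
\Delta = \frac{x_{\max} - x_{\min}}{2^{b} - 1}, \qquad
z = \operatorname{round}\!\left(\frac{x_{\min}}{\Delta}\right), \qquad
q = \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{x}{\Delta}\right) - z,\; 0,\; 2^{b} - 1\right)

% Symmetric: range centered on zero, offset fixed at -2^{b-1}
% (for b = 8 this gives z = -128, the value seen in the encoding file)
\Delta = \frac{\max(\lvert x_{\min}\rvert, \lvert x_{\max}\rvert)}{2^{b-1} - 1}, \qquad
z = -2^{\,b-1}, \qquad
q = \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{x}{\Delta}\right) - z,\; 0,\; 2^{b} - 1\right)
```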