quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

Weight type is not int after AdaRound quantization #2697

Open IJS1016 opened 7 months ago

IJS1016 commented 7 months ago

Hi, I have a question about AdaRound quantization.

I applied AdaRound quantization to my model.

The encodings file says the weights' dtype is int, like this:

```json
"activation_encodings": {
    "150": [
        {
            "bitwidth": 8,
            "dtype": "int",
            "is_symmetric": "False",
            "max": 3.8640635013580322,
            "min": 0.0,
            "offset": 0,
            "scale": 0.01515319012105465
        }
    ],
    "154": [
        {
            "bitwidth": 8,
            "dtype": "int",
            "is_symmetric": "False",
            "max": 3.8640635013580322,
            "min": 0.0,
            "offset": 0,
            "scale": 0.01515319012105465
        }
    ],
```

But the weight's dtype is not int; it is float, like this:

```
[[ 0.0477,  0.0636,  0.3232],
 [-0.2808, -0.3020, -0.3497],
 [-0.1007,  0.2331, -0.4822]],

[[-0.1007, -0.2861, -0.4186],
 [-0.3126,  0.0053,  0.3232],
 [-0.1378, -0.1113,  0.0371]],

[[-0.0159, -0.1643,  0.3073],
 [-0.1960,  0.1166, -0.0742],
 [ 0.1643,  0.1643,  0.0318]],

[[ 0.6305,  0.0477, -0.2331],
 [-0.3232,  0.0424, -0.0265],
 [ 0.0477, -0.1272, -0.0212]],

[[ 0.0212, -0.2384,  0.1325],
 [ 0.2914,  0.2808,  0.1007],
 [-0.0742,  0.4822, -0.3391]]],
```

Internally, I can see that the weight is computed from the offset and scale and converted to int8. But I want everything to perform only integer operations internally, without a float-to-int conversion step. How do I convert the actual weight type to int in torch or onnx without the Qualcomm SDK?
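For what it's worth, the `scale` and `offset` in the encodings file are enough to map each float weight onto its integer grid value by hand. Below is a minimal sketch in plain Python, assuming the common AIMET-style asymmetric scheme where `offset = round(min / scale)` and the unsigned integer range is `[0, 2**bitwidth - 1]`; the function names are mine, not part of any AIMET API:

```python
def quantize(x, scale, offset, bitwidth=8):
    """Map a float value to its integer grid index (assumed
    asymmetric scheme: q = round(x / scale) - offset, clamped
    to [0, 2**bitwidth - 1])."""
    qmax = 2 ** bitwidth - 1
    q = round(x / scale) - offset
    return max(0, min(qmax, q))

def dequantize(q, scale, offset):
    """Map an integer grid index back to a float value."""
    return (q + offset) * scale

# Encoding values copied from the encodings file above.
scale, offset = 0.01515319012105465, 0
w = 0.0477  # one float weight from the tensor above

q = quantize(w, scale, offset)        # integer representation
w_hat = dequantize(q, scale, offset)  # float value on the grid
```

The round trip `dequantize(quantize(w))` lands within half a `scale` step of the original weight, which is exactly the quantization error the simulation models.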

thanks,

quic-mangal commented 6 months ago

@IJS1016, the tools AIMET provides perform quantization simulation, in which we quantize and then immediately dequantize. So the resulting weight is kept in float. To see actual quantization happening, we suggest using the Qualcomm SDK.
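To illustrate why the simulated weights stay float: a "fake quantize" step snaps each value onto the integer grid but returns it as a float, so downstream float kernels can still consume it while the quantization error is faithfully modeled. A minimal sketch, assuming the same asymmetric scale/offset scheme as the encodings file (the function name is illustrative, not an AIMET API):

```python
def fake_quantize(x, scale, offset, bitwidth=8):
    """Quantize-dequantize: the output is still a float, but it
    can only take values of the form (q + offset) * scale for
    integer q in [0, 2**bitwidth - 1]."""
    qmax = 2 ** bitwidth - 1
    q = max(0, min(qmax, round(x / scale) - offset))
    return (q + offset) * scale

scale = 0.01515319012105465  # from the encodings file above
w_sim = fake_quantize(0.0477, scale, 0)  # float, on the int8 grid
```

This is why the saved tensor prints as floats even though the encodings file describes an int8 representation: the integers only exist transiently inside the quantize-dequantize pair.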

kgeeting commented 6 months ago

@quic-mangal I had the same question as @IJS1016; I'm seeing the same thing when reviewing the output model after AdaRound & AutoQuant. Since I'm ultimately interested in the integer version of these weights, could you provide links or guidance on what you mean by using the Qualcomm SDK to get them?

quic-mangal commented 6 months ago

@kgeeting, you could try https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk/getting-started