Mxbonn / INQ-pytorch

A PyTorch implementation of "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights"

Why does the output model size become bigger? #7

Closed · Aaron4Fun closed this issue 5 years ago

Aaron4Fun commented 5 years ago

After quantization I expected the weight parameters to be converted to an INT8 representation, so the parameter size should be smaller. Why did my output weights instead become larger (75.2 MB to 96.2 MB)? Thanks in advance.

Mxbonn commented 5 years ago

See my reply in #4

This paper, and therefore this implementation, focuses on the algorithm rather than on deployment to specific hardware. All parameters are consequently still stored as 32-bit floats; only the number of unique values per tensor is artificially restricted to what an N-bit representation could encode, so the file on disk does not shrink.
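If you want to verify this yourself, here is a minimal sketch (the checkpoint path `quantized.pth` is hypothetical) that loads a saved state dict and shows that the quantized tensors are still `float32` but contain only a small set of unique values:

```python
# Sketch: inspect an INQ-style checkpoint. The tensors remain float32
# (4 bytes per weight on disk); only the set of values they take is limited.
import torch

state_dict = torch.load("quantized.pth", map_location="cpu")

for name, tensor in state_dict.items():
    if not torch.is_tensor(tensor) or not tensor.is_floating_point():
        continue
    uniques = tensor.unique().numel()
    # For N-bit INQ you would expect at most ~2^N distinct values here,
    # yet dtype is still torch.float32, which is why the file is not smaller.
    print(f"{name}: dtype={tensor.dtype}, unique values={uniques}")
```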

See also this issue in the implementation of the authors of the paper: AojunZhou/Incremental-Network-Quantization#36
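For completeness, a hedged sketch of how the size reduction could be realized at deployment time (this is not something this implementation does): since each tensor only uses a small codebook of values, you could store one-byte indices into that codebook and expand back to `float32` when loading.

```python
# Illustrative only: pack a tensor whose values come from a small codebook
# into uint8 indices, and reconstruct the float32 tensor on load.
import torch

def compress(tensor: torch.Tensor):
    # unique() returns the sorted codebook plus, with return_inverse=True,
    # an index tensor of the same shape mapping each element to its codebook entry.
    codebook, indices = tensor.unique(return_inverse=True)
    assert codebook.numel() <= 256, "indices must fit in one byte"
    return codebook, indices.to(torch.uint8)

def decompress(codebook: torch.Tensor, indices: torch.Tensor) -> torch.Tensor:
    return codebook[indices.long()]

w = torch.tensor([0.5, -0.25, 0.5, 0.0, -0.25])  # stand-in for an INQ output
codebook, idx = compress(w)
assert torch.equal(decompress(codebook, idx), w)
# 4 bytes/weight as float32 vs. 1 byte/weight as indices (plus the tiny codebook).
print(f"{w.numel() * 4} bytes as float32 -> {idx.numel()} bytes as indices")
```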