IntelLabs / distiller

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Apache License 2.0

How to convert from a quantization-aware training model to a post-training quantization model? #288

Open zhengge opened 5 years ago

dongzhen123 commented 5 years ago

I want to ask the same question.

robotcator commented 5 years ago

It seems that converting a quantization-aware training model to a post-training quantization model is not yet covered in the documentation: https://nervanasystems.github.io/distiller/algo_quantization.html

Is there any plan to do this?

asti205 commented 4 years ago

I would also be interested (and, judging from the issue entries, I think many others would be as well) :)

levzlotnik commented 4 years ago

Hi,

Sorry for the really late response... The way QAT is implemented, the model is re-quantized on every minibatch. So at the end of training the model is already quantized and ready to use, i.e. the weights are already quantized and so are the activations. There is no need to post-training-quantize the model.
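To illustrate the point, here is a minimal PyTorch sketch of the general fake-quantization idea behind QAT. This is not Distiller's actual implementation; `fake_quantize`, `ste_round`, and `QATLinear` are hypothetical names made up for this example. It shows why no separate post-training quantization step is needed: the weights are snapped onto the quantization grid on every forward pass, so the final checkpoint already holds (effectively) quantized values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ste_round(x):
    # Straight-through estimator: round in the forward pass,
    # identity gradient in the backward pass.
    return x + (x.round() - x).detach()

def fake_quantize(t, num_bits=8):
    # Symmetric linear quantize-dequantize: snaps values onto the
    # integer grid while keeping them in float for normal training.
    qmax = 2 ** (num_bits - 1) - 1
    scale = t.detach().abs().max().clamp(min=1e-8) / qmax
    return ste_round(t / scale).clamp(-qmax, qmax) * scale

class QATLinear(nn.Linear):
    def forward(self, x):
        # The float "shadow" weights are re-quantized on every minibatch,
        # so training always sees the quantized values.
        return F.linear(x, fake_quantize(self.weight), self.bias)

layer = QATLinear(16, 4)
out = layer(torch.randn(2, 16))  # forward pass uses quantized weights
# After training, one final snap bakes the quantized values into the
# stored weights; no post-training quantization pass is required.
with torch.no_grad():
    layer.weight.copy_(fake_quantize(layer.weight))
```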

shazib-summar commented 4 years ago

@zhengge @dongzhen123 could you please elaborate on what you mean by "convert the model", and on what you aim to accomplish with this?

Thanks.