microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
https://nni.readthedocs.io

In quantization, why must we ensure that 0 is in the activation/weight range? #3493

Closed chenbohua3 closed 2 years ago

chenbohua3 commented 3 years ago

In the function update_quantization_param, there are these two lines of code:

rmin = torch.min(rmin, torch.Tensor([0]).to(rmin.device))
rmax = torch.max(rmax, torch.Tensor([0]).to(rmin.device))

I understand that for activation quantization we must make sure that 0 is in the range, because there might be zero padding. However, for weight quantization I cannot figure out why we should keep 0 in the range.
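
For concreteness, here is a minimal sketch of why clamping the range to contain 0 makes zero exactly representable (this is not NNI's actual code; the helper name `affine_qparams`, the 8-bit unsigned range, and the example values are made up):

```python
import torch

def affine_qparams(rmin: torch.Tensor, rmax: torch.Tensor, qmin: int = 0, qmax: int = 255):
    """Toy affine quantization parameters, using the same clamping idea (illustration only)."""
    # Force 0 into [rmin, rmax], as the two lines above do.
    rmin = torch.min(rmin, torch.zeros_like(rmin))
    rmax = torch.max(rmax, torch.zeros_like(rmax))
    scale = (rmax - rmin) / (qmax - qmin)
    # Because rmin <= 0 <= rmax, the zero point lands inside [qmin, qmax].
    zero_point = torch.round(qmin - rmin / scale)
    return scale, zero_point

# An all-positive activation range still represents 0.0 exactly after the clamp.
scale, zp = affine_qparams(torch.tensor([0.2]), torch.tensor([6.0]))
q = torch.round(torch.tensor([0.0]) / scale + zp)   # quantize a padded zero
deq = (q - zp) * scale                               # dequantize it
print(deq)                                           # tensor([0.]) -- no padding error
```

Without the clamp, rmin = 0.2 would put the zero point outside the integer range, so a padded 0.0 could not be represented exactly and would dequantize to roughly rmin instead of 0.
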

linbinskn commented 3 years ago

Makes sense! The current NNI quantization implementation follows gemmlowp, which extends the range to include 0 for both weight and activation quantization. That doesn't mean this implementation is optimal. Since weights have nothing to do with padding, keeping the original min and max values may be more reasonable. We will discuss it and find a better way to handle it. We also look forward to a PR from you on this topic.
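
As a rough illustration of that point about weights (a toy comparison, not NNI code; `quantize_dequantize` and the ranges are made up): for a weight tensor whose values do not straddle 0, forcing 0 into the range spends most of the integer grid on values that never occur and increases the rounding error.

```python
import torch

def quantize_dequantize(w: torch.Tensor, rmin: float, rmax: float,
                        qmin: int = 0, qmax: int = 255) -> torch.Tensor:
    """Toy affine quantize/dequantize over a given real range (illustration only)."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    q = torch.clamp(torch.round(w / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale

w = torch.rand(1000) * 0.5 + 1.0                                  # weights all in [1.0, 1.5]
tight = quantize_dequantize(w, w.min().item(), w.max().item())    # original min/max
clamped = quantize_dequantize(w, 0.0, w.max().item())             # range forced to include 0

print((w - tight).abs().max())     # ~0.001: the full 8-bit grid covers [1.0, 1.5]
print((w - clamped).abs().max())   # ~0.003: most of the grid is wasted on [0.0, 1.0]
```

The trade-off is that in the "tight" case the zero point falls outside [qmin, qmax], so an exact 0 is not representable; that matters for padded activations but not for weights.
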

chenbohua3 commented 3 years ago

I'll handle this later this weekend.

scarlett2018 commented 3 years ago

@chenbohua3 / @linbinskn / @ultmaster - has the above fix made it into the recent release candidate?

J-shang commented 2 years ago

Closing this issue because this is the default behavior of the PyTorch quantization observer.