chenbohua3 closed this issue 2 years ago
Makes sense! The current NNI quantization implementation follows gemmlowp and extends the range to include 0 for both weight and activation quantization. But that does not mean this implementation is optimal. Since weights have nothing to do with padding, keeping the original min and max values may be more reasonable for them. We will discuss this and look for a better way to handle it. We also look forward to your PR on this topic.
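For concreteness, here is a minimal sketch of the two range policies being contrasted (function and variable names are hypothetical, not NNI's actual API):

```python
import torch

def extend_range_to_zero(rmin: torch.Tensor, rmax: torch.Tensor):
    # gemmlowp-style rule: widen [rmin, rmax] so it always contains 0,
    # guaranteeing that the real value 0.0 maps exactly onto an integer
    # grid point (needed when zero padding feeds into a quantized op).
    return (torch.min(rmin, torch.zeros_like(rmin)),
            torch.max(rmax, torch.zeros_like(rmax)))

def keep_observed_range(rmin: torch.Tensor, rmax: torch.Tensor):
    # Alternative for weights: padding never touches weights, so the
    # observed min/max can be used directly for a tighter scale.
    return rmin, rmax
```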
I'll handle the topic later this weekend
@chenbohua3 / @linbinskn / @ultmaster - has the above fix made it into the recent release candidate?
Closing this issue because this is the default behavior in PyTorch's quantization observers.
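For reference, PyTorch's `MinMaxObserver` also anchors the observed range at 0 before computing quantization parameters, which appears to be the default behavior referred to here. A quick check (assuming a PyTorch version that ships `torch.ao.quantization`):

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Observe an all-positive tensor, like a weight with no negative values.
obs = MinMaxObserver()  # defaults: quint8, per_tensor_affine
obs(torch.tensor([1.0, 2.0, 3.0]))

# The scale is derived from [0.0, 3.0], not the observed [1.0, 3.0]:
# internally the observer takes min(min_val, 0) and max(max_val, 0).
scale, zero_point = obs.calculate_qparams()
print(scale.item(), zero_point.item())  # ~3/255, 0
```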
In the function `update_quantization_param` there are two lines of code that extend the quantization range to include 0. I understand that for activation quantization we must make sure that 0 is in the range, because there might be zero padding. However, for weight quantization I cannot figure out why we should keep 0 in the range.
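To illustrate why this constraint matters for activations but not weights, here is a small numeric sketch (the range values are made up for illustration): if qparams are computed from an observed activation range that excludes 0, the padding value 0.0 becomes unrepresentable and round-trips with a constant bias.

```python
# Activations observed in [0.5, 3.0], quantized to uint8 WITHOUT forcing
# the range to contain 0: the integer grid only covers reals in [0.5, 3.0].
rmin, rmax = 0.5, 3.0
scale = (rmax - rmin) / 255

def quantize(r: float) -> int:
    return max(0, min(255, round((r - rmin) / scale)))

def dequantize(q: int) -> float:
    return rmin + q * scale

# The padding value 0.0 round-trips to 0.5, injecting a constant bias
# into every zero-padded element. Weights are never padded, so they do
# not need this guarantee.
print(dequantize(quantize(0.0)))  # 0.5
```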