Closed airacid closed 2 years ago
@airacid Hi, thanks for your recognition of our work. `zero_point` needs to be stored in the corresponding data type, such as uint8, so we must ensure that it falls within that type's range to avoid overflow.
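To make the overflow point concrete, here is a minimal sketch of the usual asymmetric min-max scheme (hypothetical helper, not FQ-ViT's exact code): the unclamped zero point can land outside `[qmin, qmax]` whenever the data does not straddle zero, so it must be clamped before being stored as uint8.

```python
def minmax_qparams(x, qmin=0, qmax=255):
    """Asymmetric quantization parameters from the min/max of x.

    A sketch assuming max(x) > min(x). `scale` stays float, but
    `zero_point` must be representable in the same integer type as
    the quantized tensor (e.g. uint8), hence the clamp below.
    """
    x_min, x_max = min(x), max(x)
    scale = (x_max - x_min) / (qmax - qmin)
    # The "ideal" zero point is the integer that real 0.0 maps to ...
    zero_point = round(qmin - x_min / scale)
    # ... but for all-positive (or all-negative) data it falls outside
    # [qmin, qmax]; clamping keeps it storable as uint8 without overflow.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

print(minmax_qparams([-1.0, 3.0]))  # zero_point 64, inside [0, 255]
print(minmax_qparams([2.0, 6.0]))   # unclamped value would be -128; clamped to 0
```

The clamp trades a small quantization error on such tensors for the guarantee that `zero_point` is itself a valid uint8 value.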
Thank you for the quick reply. I've seen that some works like https://github.com/skmhrk1209/QuanTorch/blob/804269b8261560130039550d521efabaa1a87f48/quantizers.py store their `zero_point` and `scale` as floats. So I wonder why the implementations differ? I suppose their work is not fully quantized to integer types, but yours is? Sorry for asking so many questions.
The work you cited is aimed at CNNs. In that case, the `zero_point` of feature maps can be fused into the `bias` of `Conv`, and then the new `bias` can be rounded to int. However, this is not allowed in some modules of the Transformer, such as `q@k` in the self-attention module and the calculation of the mean and variance in our IntLayerNorm. So in our work, we save `zero_point` as uint8.
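The fusion mentioned above can be verified on a toy matrix-vector product (the same algebra applies to `Conv`; all values here are hypothetical, for illustration only): since `W @ (s*(q - z)) = s*(W @ q) - s*z*row_sums(W)`, the zero-point term is a constant per output channel and can be absorbed into the bias. In `q@k`, both operands are activations with their own zero points, so there is no static bias to absorb them into.

```python
# Toy weights, bias, and quantized uint8 input (hypothetical values).
W = [[1.0, -2.0, 0.5],
     [0.25, 3.0, -1.0]]
b = [0.1, -0.2]
q = [130, 7, 255]   # quantized input values
s, z = 0.02, 128    # input scale and zero_point

def matvec(W, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

# Reference: dequantize, then compute  y = W @ (s * (q - z)) + b
y_ref = [yi + bi for yi, bi in zip(matvec(W, [s * (qi - z) for qi in q]), b)]

# Fold the zero point into a new bias:
#   W @ (s*(q - z)) = s*(W @ q) - s*z*sum_j(W_ij)
b_fused = [bi - s * z * sum(row) for bi, row in zip(b, W)]
y_fused = [s * yi + bi for yi, bi in zip(matvec(W, q), b_fused)]

assert all(abs(a - c) < 1e-9 for a, c in zip(y_ref, y_fused))
```

After fusing, the integer matmul `W @ q` never sees the zero point at all, which is why CNN-oriented implementations can afford to keep it as a float.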
Hi, thanks for the wonderful work on your paper and code. I was looking into your code and I couldn't understand why you need to clamp the zero point to the range `[qmin, qmax]`. I lack knowledge of this field and hope that you can explain it for me, please.
https://github.com/linyang-zhh/FQ-ViT/blob/16122ee7ea33e80aed3edd29cfebb3ab2ce2cb69/models/ptq/observer/minmax.py#L49