Closed thuako closed 3 years ago
Hi, thanks a lot for your interest. We use the standard quantizer without optimizing the quantization range (i.e., the scale parameter), because we think it is more general in terms of the algorithm. Otherwise, it's hard to tell whether the accuracy improvement comes from our method or from the clipping/learnable-quantizer methods. Since these methods are orthogonal, I think combining them would not cause problems. That said, the gain from adding gradient-based clipping methods may not be significant, since we use 4/8-bit precision, which is much wider than binary/ternary, where a well-chosen quantization range is the key. Hope this helps.
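As a rough illustration of the distinction being made, here is a minimal sketch of a "standard" symmetric uniform quantizer whose scale is derived directly from the tensor's observed range, with no learned or optimized clipping. This is an assumption-laden simplification, not the repo's actual implementation:

```python
import numpy as np

def uniform_quantize(x, num_bits=4):
    # Standard symmetric uniform quantizer: the scale comes straight
    # from the tensor's max absolute value (no learned clipping range).
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(x).max() / qmax
    # Round to the integer grid, clip to the representable range,
    # then dequantize back to floating point.
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale, scale
```

A learnable-range method (PACT, LSQ, etc.) would instead treat `scale` (or a clipping threshold) as a trainable parameter; the point of the reply is that this choice is orthogonal to the method itself.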
Thank you for your reply :)
@Zhen-Dong
Is it OK to use the same scaling factor even when the input data differs? I can't find any code that computes scaling factors in the quantized model; it only uses fixed scaling factors.
I think your method uses a fixed scaling factor for the input data and for the inputs of each layer in the TVM Relay code. But if a fixed scaling factor is used for the input data, I think it will hurt accuracy or produce incorrect inference results.
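To make the question concrete, here is a hedged sketch of the static-quantization pattern being asked about: the activation scale is calibrated once on representative data and then reused for every inference input, so an out-of-range input gets clipped rather than re-scaled. The function names are hypothetical, not from the repo:

```python
import numpy as np

def calibrate_scale(calibration_batches, num_bits=8):
    # Static (fixed) activation scale: computed once from calibration
    # data, then reused unchanged for all future inputs.
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(np.abs(b).max() for b in calibration_batches)
    return max_abs / qmax

def quantize_with_scale(x, scale, num_bits=8):
    # Inference-time quantization with the precomputed scale:
    # values outside the calibrated range are simply clipped.
    qmax = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale
```

The trade-off the question points at: a fixed scale avoids per-input range computation at inference time, at the cost of clipping error on inputs whose range exceeds what was seen during calibration.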
Hi, I want to combine your HAWQ-v3 with QNN methods that implement custom gradients for the scale parameters, such as PACT, QIL, and LSQ.
I wonder why you didn't try learning those scale parameters with gradients.
Is there a problem with training, or something else?
I would appreciate your reply.
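For reference, the kind of custom scale gradient mentioned here (LSQ-style) can be sketched as follows. The forward pass uses a straight-through round, and the gradient with respect to the step size follows the closed form from the LSQ paper; this is a simplified numpy illustration, not code from either project:

```python
import numpy as np

def lsq_forward_and_scale_grad(x, s, num_bits=4):
    # LSQ-style learnable step size: quantize with step s, and compute
    # the elementwise gradient of the output w.r.t. s.
    qn, qp = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    v = x / s
    vq = np.clip(np.round(v), qn, qp)
    out = vq * s
    # LSQ gradient w.r.t. s: clipped values contribute the clip bound,
    # in-range values contribute the rounding residual (vq - v).
    grad_s = np.where(v <= qn, qn, np.where(v >= qp, qp, vq - v))
    return out, grad_s
```

In a training framework this would be wrapped in a custom autograd function (with a straight-through estimator for the rounding), which is presumably what "custom gradient in scale parameters" refers to.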