Open ulfhanebutte opened 3 months ago
Hello, I ran into a problem: asymmetric quantization in the onnxruntime QDQ format does not work on the OpenVINO executor. Can you please elaborate on which accelerators do not support asymmetric quantization? And what do you mean by accelerators in your context? Do you know whether OpenVINO really does not support asymmetric quantization for the QDQ format, and if so, why?
Describe the feature request
The QDQ process includes symmetric quantization and asymmetric quantization, the latter by introducing a zero-point offset. Many accelerators do not support a zero-point offset, so symmetric quantization is needed. That is not ideal for tensors that are never negative, e.g. an output tensor after a ReLU activation function, because half of the int8 range goes unused. The requested feature is to allow tensors to be int8 or uint8, and to use uint8 for tensors that are never negative. This is equivalent to uint8 with a zero_point of either 128 or 0.
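The equivalence claimed above (symmetric int8 is the same as uint8 with zero_point 128) can be checked with a small NumPy sketch. The helper names are hypothetical and not part of onnxruntime, and the sketch assumes the full int8 range [-128, 127] with a hand-picked scale.

```python
import numpy as np

def quantize_int8_symmetric(x, scale):
    # Symmetric int8: zero_point is implicitly 0 (assumes full [-128, 127] range).
    q = np.clip(np.round(x / scale), -128, 127)
    return q.astype(np.int8)

def quantize_uint8(x, scale, zero_point):
    # uint8 with an explicit zero_point, as in QuantizeLinear-style arithmetic.
    q = np.clip(np.round(x / scale) + zero_point, 0, 255)
    return q.astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Widen to int32 before subtracting so uint8 values cannot wrap around.
    return (q.astype(np.int32) - zero_point) * scale

x = np.array([-1.0, -0.25, 0.0, 0.5, 0.99], dtype=np.float32)
scale = 1.0 / 128  # illustrative scale, chosen so that -1.0 maps to -128

q_i8 = quantize_int8_symmetric(x, scale)
q_u8 = quantize_uint8(x, scale, zero_point=128)

# The uint8 codes are the int8 codes shifted by 128, so both dequantize identically.
same_codes = np.array_equal(q_u8.astype(np.int32) - 128, q_i8.astype(np.int32))
same_values = np.allclose(dequantize(q_i8, scale, 0), dequantize(q_u8, scale, 128))
print(same_codes, same_values)
```

Under these assumptions the two representations differ only by a constant shift of the stored integer, which is why hardware with native int8 and uint8 support would not need a general zero-point offset for these two cases.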
Describe scenario use case
An example is a tensor after the ReLU or Sigmoid activation function. Both functions guarantee that no tensor value is negative. The new restricted asymmetric quantization mode would provide a zero_point of 0 for such a tensor stored in uint8, and all tensors that have both negative and positive values would be represented with uint8 and a zero_point offset of 128. As this new mode is restricted to only these two cases, accelerator HW that supports int8 and uint8 tensors can use this new restricted asymmetric quantization mode.
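A minimal sketch of how the proposed restricted mode might choose its parameters is shown below. The function `restricted_qparams` is hypothetical (it is not an existing onnxruntime API), and the scale computation is an assumption based on simple min/max calibration.

```python
import numpy as np

def restricted_qparams(x):
    # Hypothetical helper: pick (scale, zero_point) using only the two
    # zero_point values the proposal allows, 0 and 128.
    x = np.asarray(x, dtype=np.float32)
    lo, hi = float(x.min()), float(x.max())
    if lo >= 0.0:
        # No negative values (e.g. after ReLU or Sigmoid): use all 256 uint8 levels.
        zero_point = 0
        scale = hi / 255.0
    else:
        # Mixed-sign tensor: uint8 with zero_point 128, i.e. shifted symmetric int8.
        zero_point = 128
        scale = max(-lo / 128.0, hi / 127.0)
    # Guard against an all-zero tensor (assumption: fall back to scale 1.0).
    return (scale if scale > 0.0 else 1.0), zero_point

def quantize(x, scale, zero_point):
    q = np.round(np.asarray(x, dtype=np.float32) / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

relu_out = np.array([0.0, 0.5, 1.0, 2.0])
mixed = np.array([-1.0, 0.0, 0.5, 1.0])

s_relu, zp_relu = restricted_qparams(relu_out)
s_mixed, zp_mixed = restricted_qparams(mixed)

q_relu = quantize(relu_out, s_relu, zp_relu)
q_mixed = quantize(mixed, s_mixed, zp_mixed)
print(zp_relu, zp_mixed)
```

In this sketch the non-negative tensor gets a step size of max/255 instead of roughly max/127 under symmetric int8, so its resolution roughly doubles while the zero_point stays within the two supported values.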