bhbruce opened this issue 3 years ago
@thaink @teijeong @daverim for visibility.
The weight input for the FC op is not actually a weight; that's why we don't use symmetric quantization. I think this is a corner case: TF's EinsumDense gets converted into TFLite FC ops because TFLite doesn't have a matmul op, but the result seems to violate the quantization spec. @teijeong Do you have any idea why it stopped working on TF 2.6+?
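For context, the TFLite quantization spec requires weight tensors to be quantized symmetrically (int8, zero-point fixed at 0, range restricted to [-127, 127]), while activations may use an asymmetric zero-point. A minimal pure-Python sketch of the two schemes (function names are illustrative, not from TFLite's code):

```python
# Sketch of the two int8 quantization schemes in the TFLite spec.
# Symmetric (used for weights): zero_point is fixed at 0 and the
# scale maps the max absolute value onto [-127, 127].
# Asymmetric (used for activations): zero_point is chosen so the
# full real range [min, max] maps onto [-128, 127].

def quantize_symmetric(values):
    scale = max(abs(v) for v in values) / 127.0
    zero_point = 0  # fixed at 0 by the spec for weight tensors
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale, zero_point

def quantize_asymmetric(values):
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

weights = [-0.5, 0.1, 0.9]
q_sym, s_sym, zp_sym = quantize_symmetric(weights)
q_asym, s_asym, zp_asym = quantize_asymmetric(weights)
print("symmetric zero_point:", zp_sym)   # always 0 for weights
print("asymmetric zero_point:", zp_asym) # generally nonzero
```

So an FC op whose weight tensor carries a nonzero zero-point (as reported below) is using the activation-style scheme on a tensor the spec treats as a weight.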
@Xhark Thanks for your response! @teijeong Is there any update on this issue?
1. System information
2. Code
Provide code to help us reproduce your issue using one of the following options:
1) Demonstrate how to build your TF model: I downloaded the quantization-aware-training (QAT) int8 model from the google-research/mobilebert repo. The model download link is download link.
2) Please follow this colab page to convert the model.
QAT INT8 MobileBERT TensorFlow model: download link. Untar the file and you will find the model in "mobilebert_squad_savedmodels/quant_saved_model".
Converted INT8 tflite model: download link
3. Failure after conversion
Model produces wrong results: the FC layers' weight zero-points != 0, which violates the quantization spec.
Failure to convert the model to tflite: only TF 2.5.0 can successfully convert it to an INT8 tflite model; in other words, TF 2.6 does not work.
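The reported zero-point violation can be checked on any converted .tflite file by reading the per-tensor quantization parameters through the TFLite interpreter. The sketch below is a stand-in, not the MobileBERT repro: it full-integer-quantizes a tiny Dense model (which becomes a FULLY_CONNECTED op) and then flags any per-channel-quantized tensor, i.e. a weight, whose zero-points are not all zero; on the broken MobileBERT conversion the same loop would report the offending FC weights.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model: one Dense layer converts to a TFLite
# FULLY_CONNECTED op.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

def representative_dataset():
    # Calibration data for full-integer post-training quantization.
    for _ in range(10):
        yield [np.random.rand(1, 8).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
details = interpreter.get_tensor_details()
bad = []
for d in details:
    params = d["quantization_parameters"]
    scales, zps = params["scales"], params["zero_points"]
    # Per-channel quantization is only applied to weight tensors
    # (FC weights may be per-tensor depending on the TF version);
    # the spec requires weight zero-points to be 0.
    if len(scales) > 1 and any(zp != 0 for zp in zps):
        bad.append(d["name"])
print("weight tensors violating symmetric quantization:", bad)
```

On a spec-conformant conversion `bad` is empty; a nonzero weight zero-point, as in this report, would show up in the list.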