tensorflow / model-optimization

A toolkit to optimize ML models for deployment, for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Weights in fully connected layers don't follow the TensorFlow quantization spec (zero-point != 0) #822

Open bhbruce opened 3 years ago

bhbruce commented 3 years ago

1. System information

2. Code

Provide code to help us reproduce the issue using one of the following options:

1) Demonstrate how to build your TF model: I downloaded the quantization-aware-trained int8 model from the google-research/mobilebert repo. The model is available at the download link.
2) Please follow this colab page to convert the model.

3. Failure after conversion
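One way to confirm the reported violation is to scan the converted model's tensors for nonzero zero points. This is a minimal sketch, not from the issue itself: per the TFLite quantization spec, int8 weight tensors must be symmetric (every zero point equal to 0), so any int8 tensor with a nonzero zero point is suspect.

```python
import numpy as np
import tensorflow as tf


def find_asymmetric_int8_tensors(model_path):
    """Return (name, zero_points) pairs for int8 tensors whose
    zero point is nonzero, i.e. candidates for a spec violation."""
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    offenders = []
    for detail in interpreter.get_tensor_details():
        zero_points = detail["quantization_parameters"]["zero_points"]
        if detail["dtype"] == np.int8 and np.any(zero_points != 0):
            offenders.append((detail["name"], zero_points))
    return offenders
```

Running this on the converted MobileBERT model should surface the fully connected weight tensors with zero-point != 0 described above.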

abattery commented 3 years ago

@thaink @teijeong @daverim for the visibility.

Xhark commented 3 years ago

The weight input for this FC op is not actually a weight; that's why we don't use symmetric quantization. I think this is a corner case where TF EinsumDense is converted to TFLite FC ops because TFLite doesn't have a matmul op, but it seems to violate the quantization spec. @teijeong Do you have any idea why it stopped working on TF 2.6+?
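The conversion path described here can be exercised with a small model. A minimal sketch, with illustrative shapes and an assumed equation, showing a Keras EinsumDense layer being lowered by the TFLite converter (which has no general matmul op, so such contractions end up as FULLY_CONNECTED ops):

```python
import tensorflow as tf

# A batched matmul-style contraction via EinsumDense:
# "abc,cd->abd" contracts the last input axis against a learned kernel.
inputs = tf.keras.Input(shape=(4, 8))
outputs = tf.keras.layers.EinsumDense("abc,cd->abd", output_shape=(4, 16))(inputs)
model = tf.keras.Model(inputs, outputs)

# Convert to TFLite; the einsum has no direct TFLite counterpart,
# so the converter rewrites it using FC ops.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
```

In the quantized case, the second operand of such a rewritten FC op comes from an activation rather than a true weight tensor, which is why it ends up asymmetrically quantized.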

bhbruce commented 3 years ago

@Xhark Thanks for your response! @teijeong Is there any update on this issue?