tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Input and resource quantization #1003

Open Jwy-Leo opened 2 years ago

Jwy-Leo commented 2 years ago

System information

Motivation
A layer's inputs and resources (non-trainable variables) need a quantization interface in the custom-layer configuration. Normalization is a basic building block in DNNs, but we cannot quantize it easily:

  1. There is no input-quantization interface in the custom configuration, so the normalization's multiply-add structure ends up in float32 instead of int8 in the exported TFLite file.
  2. There is no resource-quantization interface in the custom configuration, so batch norm's moving mean and moving variance are left without QAT. A sketch illustrating the gap follows this list.
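To make the request concrete, here is a minimal sketch of a custom config for a standalone `keras.layers.BatchNormalization`, assuming the standard `tfmot.quantization.keras.QuantizeConfig` API (the class name `NormQuantizeConfig` is just for illustration). The API exposes hooks only for trainable weights, activations, and outputs; there is no place to attach quantizers to the layer's inputs or to its non-trainable moving statistics, which is exactly the gap this issue is about.

```python
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class NormQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Hypothetical config for a standalone keras.layers.BatchNormalization."""

    # Only trainable weights (gamma, beta) can be hooked here; there is no
    # analogous hook for the non-trainable moving_mean / moving_variance.
    def get_weights_and_quantizers(self, layer):
        quantizer = LastValueQuantizer(
            num_bits=8, symmetric=True, narrow_range=False, per_axis=False)
        return [(layer.gamma, quantizer), (layer.beta, quantizer)]

    def get_activations_and_quantizers(self, layer):
        # BatchNormalization has no activation attribute to quantize.
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        layer.gamma, layer.beta = quantize_weights

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    # Outputs can be quantized...
    def get_output_quantizers(self, layer):
        return [MovingAverageQuantizer(
            num_bits=8, symmetric=False, narrow_range=False, per_axis=False)]

    # ...but there is no get_input_quantizers()-style hook, so the
    # (x - mean) / sqrt(var + eps) multiply-add still sees float32 inputs.
    def get_config(self):
        return {}
```

Even when this config is attached via `quantize_annotate_layer(..., quantize_config=NormQuantizeConfig())` and `quantize_apply`, the moving mean and variance stay untouched during QAT.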
Xhark commented 2 years ago

Hi, could you please give us some examples? We usually assume BNs are folded (fused) into a nearby layer during optimization. I'd like to know about use cases where quantizing them separately is required. Thanks.
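For context, here is a rough NumPy sketch of the folding referred to above (shapes and values are hypothetical). Because the BN's scale and shift can be absorbed into the preceding layer's weights and bias at inference time, a BN that follows a Conv/Dense layer normally does not need its own quantization; the question in this issue is about normalization layers that cannot be folded this way.

```python
import numpy as np

# y = gamma * (x @ W + b - mean) / sqrt(var + eps) + beta
#   = x @ (W * scale) + ((b - mean) * scale + beta),  where scale = gamma / sqrt(var + eps)


def fold_bn_into_dense(W, b, gamma, beta, mean, var, eps=1e-3):
    scale = gamma / np.sqrt(var + eps)       # per-output-channel scale
    W_fold = W * scale                       # fold the scale into the weights
    b_fold = (b - mean) * scale + beta       # fold the shift into the bias
    return W_fold, b_fold


# Quick numerical check on random values.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W, b = rng.normal(size=(8, 16)), rng.normal(size=16)
gamma, beta = rng.normal(size=16), rng.normal(size=16)
mean, var = rng.normal(size=16), rng.uniform(0.5, 2.0, size=16)

y_bn = gamma * (x @ W + b - mean) / np.sqrt(var + 1e-3) + beta
W_f, b_f = fold_bn_into_dense(W, b, gamma, beta, mean, var)
assert np.allclose(y_bn, x @ W_f + b_f)
```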