tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Input and resource quantization #1003

Open Jwy-Leo opened 2 years ago

Jwy-Leo commented 2 years ago

System information

Motivation
A layer's inputs and resources (non-trainable variables) need a quantization interface in the custom-layer configuration. Normalization is a basic building block in DNNs, but we cannot quantize it easily:

  1. There is no input-quantization interface in the custom configuration, so the normalization's multiply-add structure ends up in float32 instead of int8 in the exported TFLite file.
  2. There is no resource-quantization interface in the custom configuration, so batch norm's moving mean and moving variance are left without QAT. A sketch illustrating the gap follows this list.
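To make the request concrete, here is a minimal sketch of a custom config for a standalone `keras.layers.BatchNormalization`, assuming the standard `tfmot.quantization.keras.QuantizeConfig` API (the class name `NormQuantizeConfig` is just for illustration). The API exposes hooks only for trainable weights, activations, and outputs; there is no place to attach quantizers to the layer's inputs or to its non-trainable moving statistics, which is exactly the gap this issue is about.

```python
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class NormQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Hypothetical config for a standalone keras.layers.BatchNormalization."""

    # Only trainable weights (gamma, beta) can be hooked here; there is no
    # analogous hook for the non-trainable moving_mean / moving_variance.
    def get_weights_and_quantizers(self, layer):
        quantizer = LastValueQuantizer(
            num_bits=8, symmetric=True, narrow_range=False, per_axis=False)
        return [(layer.gamma, quantizer), (layer.beta, quantizer)]

    def get_activations_and_quantizers(self, layer):
        # BatchNormalization has no activation attribute to quantize.
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        layer.gamma, layer.beta = quantize_weights

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    # Outputs can be quantized...
    def get_output_quantizers(self, layer):
        return [MovingAverageQuantizer(
            num_bits=8, symmetric=False, narrow_range=False, per_axis=False)]

    # ...but there is no get_input_quantizers()-style hook, so the
    # (x - mean) / sqrt(var + eps) multiply-add still sees float32 inputs.
    def get_config(self):
        return {}
```

Even when this config is attached via `quantize_annotate_layer(..., quantize_config=NormQuantizeConfig())` and `quantize_apply`, the moving mean and variance stay untouched during QAT.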
Xhark commented 2 years ago

Hi, could you please give us some examples? We usually assume BNs are folded (fused) into a nearby layer during optimization. I'd like to know about use cases where quantizing them separately is required. Thanks.
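For context, here is a rough NumPy sketch of the folding referred to above (shapes and values are hypothetical). Because the BN's scale and shift can be absorbed into the preceding layer's weights and bias at inference time, a BN that follows a Conv/Dense layer normally does not need its own quantization; the question in this issue is about normalization layers that cannot be folded this way.

```python
import numpy as np

# y = gamma * (x @ W + b - mean) / sqrt(var + eps) + beta
#   = x @ (W * scale) + ((b - mean) * scale + beta),  where scale = gamma / sqrt(var + eps)


def fold_bn_into_dense(W, b, gamma, beta, mean, var, eps=1e-3):
    scale = gamma / np.sqrt(var + eps)       # per-output-channel scale
    W_fold = W * scale                       # fold the scale into the weights
    b_fold = (b - mean) * scale + beta       # fold the shift into the bias
    return W_fold, b_fold


# Quick numerical check on random values.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W, b = rng.normal(size=(8, 16)), rng.normal(size=16)
gamma, beta = rng.normal(size=16), rng.normal(size=16)
mean, var = rng.normal(size=16), rng.uniform(0.5, 2.0, size=16)

y_bn = gamma * (x @ W + b - mean) / np.sqrt(var + 1e-3) + beta
W_f, b_f = fold_bn_into_dense(W, b, gamma, beta, mean, var)
assert np.allclose(y_bn, x @ W_f + b_f)
```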