tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

4bit trained model with TFMOT not able to convert to tflite with 4bit data range #574

Closed: joyalbin closed this issue 3 years ago

joyalbin commented 4 years ago

Hi, I'm facing an issue with the TFLiteConverter.

I created a model through QAT using the TFMOT framework with 4-bit training. After converting the model to TFLite using the TFLiteConverter, the model parameters are still in the full INT8 data range.
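
Roughly, the setup looks like this (a minimal sketch using the standard TFMOT custom `QuantizeConfig` pattern with `num_bits=4`; the model, layers, and training details here are simplified stand-ins, not my actual model):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantizers = tfmot.quantization.keras.quantizers

class FourBitQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Quantize a Dense layer's kernel and activation with 4-bit fake-quant."""

    def get_weights_and_quantizers(self, layer):
        return [(layer.kernel, quantizers.LastValueQuantizer(
            num_bits=4, symmetric=True, narrow_range=True, per_axis=False))]

    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, quantizers.MovingAverageQuantizer(
            num_bits=4, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}

annotate = tfmot.quantization.keras.quantize_annotate_layer
model = tfmot.quantization.keras.quantize_annotate_model(tf.keras.Sequential([
    annotate(tf.keras.layers.Dense(10, input_shape=(20,)),
             quantize_config=FourBitQuantizeConfig()),
]))

with tfmot.quantization.keras.quantize_scope(
        {'FourBitQuantizeConfig': FourBitQuantizeConfig}):
    qat_model = tfmot.quantization.keras.quantize_apply(model)

# ... compile and train qat_model as usual ...

converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()  # weights come out spanning the full INT8 range
```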

We need the converted model to keep its weights in INT8 storage but restrict their values to the 4-bit data range [-7, 7].

Expected Result:

  1. 4-bit model, data range [-7, 7]
  2. weights data type INT8

Actual Result (see the inspection sketch below):

  1. 4-bit-trained model, but data range [-127, 127]
  2. weights data type INT8
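
For reference, the weight range can be checked by walking the converted model's tensors with the TFLite interpreter. This is a minimal sketch; the file name is a placeholder:

```python
import numpy as np
import tensorflow as tf

# 'model_4bit.tflite' is a placeholder path for the converted model.
interpreter = tf.lite.Interpreter(model_path='model_4bit.tflite')
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    if detail['dtype'] != np.int8:
        continue
    try:
        tensor = interpreter.get_tensor(detail['index'])
    except ValueError:
        continue  # intermediate tensors may have no data to read
    # For the weight tensors, this prints values near -127/127, not [-7, 7].
    print(detail['name'], int(tensor.min()), int(tensor.max()))
```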

Please help me understand how to resolve this.

Thanks, Albin

xhae commented 3 years ago

Hi, sorry for the huge delay. The TFLite team is working on 4-bit quantization, but I doubt TFLite can currently support this through converter options. @daverim, any ideas?

daverim commented 3 years ago

Hi, xhae@ is correct: we don't have a 4-bit scheme in TFLite. However, it sounds like you want to convert your 4-bit weights to 8 bits with the range fixed to [-7, 7]. We don't have a feature that can do this, because the TFLite converter quantizes float weights to 8 bits using only the min/max range, regardless of the number of bits configured in the fake_quant op.
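
To make that concrete, here is a small NumPy sketch of what symmetric per-tensor int8 conversion does to weights that were fake-quantized onto a 4-bit grid (the values are illustrative):

```python
import numpy as np

# Weights already fake-quantized to a symmetric 4-bit grid in float
# (multiples of one 4-bit step; the values here are made up).
w_float = np.array([-0.70, -0.35, 0.0, 0.35, 0.70])
w_min, w_max = w_float.min(), w_float.max()   # the only thing the converter uses

scale = max(abs(w_min), abs(w_max)) / 127.0   # symmetric int8 scale
w_int8 = np.round(w_float / scale).astype(np.int8)
print(w_int8)  # [-127  -64    0   64  127] -> full int8 range, not [-7, 7]
```

Because the converter only sees the float min/max, the 15-level 4-bit grid is simply stretched across the full int8 range.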

TL;DR: we don't currently support this use case.