Closed: jasonravagli closed this issue 1 year ago
Hello @jasonravagli, Currently, we only support exporting TensorFlow models with fake-quantized weights. We do not recommend applying TFLite optimizations, as they may change how MCT quantizes the weights. Feel free to let us know if you have further questions or concerns.
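For context, converting the fake-quantized Keras model to a float TFLite model without enabling any TFLite optimizations might look like the sketch below. This is only an illustration under assumptions: `quantized_model` and the output path are placeholder names, not part of the MCT API.

```python
import tensorflow as tf

# quantized_model: the fake-quantized Keras model returned by MCT (placeholder name).
# No converter.optimizations are set, so the converter keeps the float32 weights
# produced by MCT's fake quantization instead of re-quantizing them.
converter = tf.lite.TFLiteConverter.from_keras_model(quantized_model)
tflite_model = converter.convert()

with open("quantized_model.tflite", "wb") as f:
    f.write(tflite_model)
```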
Thank you for your quick response. If I understand correctly, there is currently no direct way to convert a TensorFlow quantized model generated by MCT into a TFLite model. MCT already offers very useful functionality, and adding this capability would make it easier to deploy models on edge devices.
Thank you again for your time.
Hi @jasonravagli, A new method for exporting TFLite int8 models from MCT has recently been added and will be available in the upcoming release. Please keep in mind that this is an experimental feature and is subject to future changes.
You can find more information and a usage example here. If you have any questions or issues, please let us know.
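For readers arriving later, an export call of this kind might look roughly like the sketch below. The function name, enum values, and availability in a given MCT version are assumptions on my part (the feature was marked experimental), so the linked usage example remains the authoritative reference.

```python
import model_compression_toolkit as mct

# quantized_model: the quantized Keras model produced by MCT's PTQ flow (placeholder name).
# The serialization/quantization format arguments are assumed from recent MCT docs and
# may differ across versions; check the linked example for the exact API.
mct.exporter.keras_export_model(
    model=quantized_model,
    save_model_path="model_int8.tflite",
    serialization_format=mct.exporter.KerasExportSerializationFormat.TFLITE,
    quantization_format=mct.exporter.QuantizationFormat.INT8,
)
```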
Issue Type
Documentation
Source
pip (model-compression-toolkit)
MCT Version
1.7.1
OS Platform and Distribution
No response
Python version
No response
Describe the issue
As mentioned in #273, the MCT quantization produces fake-quantized models with float32 weights and currently does not support models with int8 weights. However, from the code and the documentation, it is not clear to me how MCT integrates with TFLite to produce models with integer weights for deployment on edge devices.
The TFLite full-integer quantization of a model with float weights requires a PTQ process with a representative dataset. Will this process ruin the MCT quantization? Are there any other methods to convert an MCT-quantized Keras model to a TFLite model with integer weights?
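For reference, the TFLite full-integer PTQ flow the question refers to is the standard converter path sketched below; `keras_model` and `representative_data` are placeholder names. The concern above is whether running this second quantization pass on an MCT fake-quantized model preserves the quantization parameters MCT already chose.

```python
import numpy as np
import tensorflow as tf

# keras_model: the MCT fake-quantized Keras model (placeholder name).
# representative_data: a small set of calibration inputs (placeholder name).
def representative_dataset():
    for sample in representative_data[:100]:
        yield [np.expand_dims(sample, axis=0).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer quantization of weights and activations.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_int8_model = converter.convert()
```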
Expected behaviour
No response
Code to reproduce the issue
Log output
No response