cratajczak-EMM opened 2 weeks ago
Hey @cratajczak-EMM, I am not sure this is an issue. When converting a model with `tf.lite.TFLiteConverter`, you can set these extra options:

- `inference_input_type`
- `inference_output_type`

https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter

By default these are left as float so that you do not have to change your input pipeline when using quantization, but if you need to feed 8-bit data directly, the options above should solve it.
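A minimal sketch of the full-integer post-training quantization flow, using a tiny throwaway Keras model and a random representative dataset as stand-ins for your real model and calibration data:

```python
import numpy as np
import tensorflow as tf

# Tiny placeholder model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

# A representative dataset is required to calibrate activation ranges
# for full-integer quantization; random data here for illustration only.
def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 4).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to integer-only builtin ops so no float fallback remains.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Make the model's input and output tensors int8 as well.
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Verify the converted model really expects int8 input.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]["dtype"])  # should be int8
```

With `supported_ops` restricted to `TFLITE_BUILTINS_INT8`, the quantize/dequantize pair at the model boundary is the only float handling left; setting the two inference type options removes even that, so the interpreter consumes and produces int8 tensors directly.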
Hello,
Sorry, this is not really an issue, but rather a question.
Is there any way to perform integer-only inference? From what I can see, even though the weights are quantized (from 32-bit to 8-bit), there is a dequantize operation before the mathematical operations are performed, followed by a quantize operation on the layer's output.
Best regards
Christophe