Closed akrapukhin closed 10 months ago
The TFLite converter makes a number of assumptions about QAT models, which are reflected in the TFLite quantization op spec here: https://www.tensorflow.org/lite/performance/quantization_spec
For the concat op, the spec above imposes a restriction: inputs and outputs must all have the same scale/zero_point.
For the QAT case, we usually remove all quantization-related ops from the inputs of the concat op and move them to its output. TFLite doesn't support the case where the inputs of a concat op have different quantization params.
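A minimal sketch (not from the issue) of why the restriction exists. Per the linked spec, a quantized value q represents real = scale * (q - zero_point), so if two concat inputs use different scales, their raw int8 buffers can't simply be copied into one output tensor; the converter would need an extra requantize step. The values and scales below are made up for illustration.

```python
def quantize(real, scale, zero_point):
    """Quantize a real value to int8 per the TFLite affine scheme."""
    q = round(real / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

# The same real values quantized with two different scales:
a = [quantize(x, 0.02, 0) for x in (0.1, 0.5)]  # scale used by input A
b = [quantize(x, 0.05, 0) for x in (0.1, 0.5)]  # scale used by input B

# Identical reals, but the int8 encodings differ, so a raw concat of the
# two buffers would be meaningless without a shared scale/zero_point.
assert a != b
assert abs(dequantize(a[0], 0.02, 0) - dequantize(b[0], 0.05, 0)) < 1e-9
```

This is why the converter wants quantization parameters unified across all concat inputs and the output.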
I'd recommend moving quantization to the output side of the concat, or explicitly adding identity-like ops to the inputs of the concat op, as you did.
@akrapukhin @Xhark cc. @yyoon Hello, I fixed this issue in this commit. Please let me know if you encounter this issue again. Thanks for your report and patience.
Describe the bug When there is a concat layer in a model, the TFLite convert() call might generate an error like this:
However, if I add a superfluous dimension to each input tensor going into concat and then remove this dimension after the concat (see RESHAPE_TRICK below), the model converts successfully.
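The shape manipulation behind the workaround can be sketched outside the converter, here in NumPy, just to show that the data layout is unchanged: each input gains a trailing size-1 axis, the concat happens on the original axis, and the extra axis is squeezed away afterwards. The shapes and axis here are illustrative, not taken from the issue's model.

```python
import numpy as np

# Hypothetical input tensors feeding a concat on axis 1.
a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.arange(10, dtype=np.float32).reshape(2, 5)

# Plain concat (the path that fails to convert in the issue).
direct = np.concatenate([a, b], axis=1)  # shape (2, 8)

# RESHAPE_TRICK: add a size-1 axis to each input, concat on the
# original axis, then squeeze the extra axis away again.
tricked = np.squeeze(
    np.concatenate([a[..., np.newaxis], b[..., np.newaxis]], axis=1),
    axis=-1,
)  # shape (2, 8)

# The result is numerically identical to the direct concat.
assert np.array_equal(direct, tricked)
```

The extra reshape ops only change how the converter places quantize/dequantize nodes around the concat; the tensor contents are the same either way.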
System information
TensorFlow version: 2.12.0
TensorFlow Model Optimization version (binary): 0.7.3
Python version: 3.9.16
Describe the expected behavior A model with a concat layer should be quantized without strange reshape tricks.
Describe the current behavior An error occurs when calling convert().
Code to reproduce the issue If RESHAPE_TRICK is False, the code will produce the error; if set to True, the model is converted successfully.
Additional context I also tried to make both inputs to the concat layer have the same min/max range to force the scales to be identical (see the commented code before the converter), but it didn't work because the resulting scales still differ slightly:
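For reference, a sketch of how an asymmetric int8 (scale, zero_point) pair is derived from a tensor's real range, following the real = scale * (q - zero_point) formula in the quantization spec linked above. It shows why even a tiny difference in the observed min/max of the two inputs yields different scales, which trips the same-params restriction. The ranges below are made up, and the helper is a simplified illustration, not the converter's actual code.

```python
def qparams_from_range(rmin, rmax, qmin=-128, qmax=127):
    """Derive an asymmetric int8 (scale, zero_point) from a real range.

    Simplified sketch: the range is first widened to include 0 so that
    zero is exactly representable, per the usual affine scheme.
    """
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))

# Two concat inputs whose observed ranges differ only slightly still get
# slightly different scales, violating the same scale/zero_point rule:
s1, _ = qparams_from_range(-1.0, 1.0)
s2, _ = qparams_from_range(-1.0, 1.001)
assert s1 != s2
```

So unless the fake-quant min/max observed for both inputs match exactly, the per-tensor scales will not be identical, which is consistent with what the commented-out code in the report ran into.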