Open guls999 opened 3 years ago
Having the same problem; watching this issue.
me too
Can you share steps to generate those two files - inception_v3.h5 and inception_quant.h5?
I tried with this colab, but can't reproduce the issue.
Meanwhile, it looks strange that some layers are not quantized - I'll take a look.
I also could not reproduce the different behaviour when converting directly vs converting after loading quantized weights. However there seem to be improperly quantized nodes in the converted TFLite model (regardless of loading quantized weights).
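In case it helps others check their own converted models, here is a minimal sketch (the .tflite path is an assumption) that lists every tensor's name, dtype and quantization parameters, which makes improperly quantized nodes easy to spot:

import tensorflow as tf

# Load the converted flatbuffer; the file name is an assumption.
interpreter = tf.lite.Interpreter(model_path="inception_quant.tflite")

# List each tensor's dtype and quantization parameters to spot
# float32 tensors left in an otherwise int8 graph.
for detail in interpreter.get_tensor_details():
    print(detail["index"], detail["name"], detail["dtype"], detail["quantization"])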
The issue has been fixed in tensorflow>=2.6.0rc0. Please upgrade.
A test has been added to prevent regression.
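If you are unsure which version you are running, a quick hedged check before re-converting (assumes the packaging module is available):

import tensorflow as tf
from packaging import version

# The converter fix ships in tensorflow >= 2.6.0rc0.
if version.parse(tf.__version__) < version.parse("2.6.0rc0"):
    print(f"TensorFlow {tf.__version__} predates the fix; please upgrade.")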
@fredrec Roughly how long is TensorFlow >= 2.6.0rc0 expected to be supported? And do all models support QAT?
I use QAT to fine-tune an InceptionV3 model. I save the model once it has fit the data, then load model.h5 to restore it. When I convert the .h5 to TFLite, there is a node whose input type is int8 but whose output type is float32, so I hit the following error when I run inference with the TFLite model.
2021-01-25 09:00:36.746994: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "infer_tflite.py", line 13, in <module>
    interpreter.allocate_tensors()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/interpreter.py", line 335, in allocate_tensors
    return self._interpreter.AllocateTensors()
RuntimeError: tensorflow/lite/kernels/pooling.cc:79 input->type != output->type (INT8 != FLOAT32) Node number 8 (AVERAGE_POOL_2D) failed to prepare.
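For context, a minimal inference sketch of the kind that hits this error (not the actual infer_tflite.py; the .tflite path is an assumption). allocate_tensors() is where preparation fails:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="inception_quant.tflite")
interpreter.allocate_tensors()  # fails here: AVERAGE_POOL_2D input/output type mismatch

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy image just to exercise the graph.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]["index"]).shape)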
System information
TensorFlow version: 2.5.0.dev20210124
TensorFlow Model Optimization version: 0.5.0
Python version: 3.6.9
I followed the official guide to train and convert the model:

import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow.keras.models import load_model

model = load_model(classic_model, compile=False)
quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)
I use ModelCheckpoint to save my quantized model and load_weights to restore it:

model = load_model('inception_v3.h5', compile=False)
quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)
q_aware_model.load_weights('inception_quant.h5', by_name=True)
Then I use tf.lite to convert the model:

converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
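For completeness, a small follow-up that writes the converted flatbuffer to disk so the interpreter can load it (continuing from the converter call above; the output file name is an assumption):

# Persist the converted model for the TFLite interpreter.
with open("inception_quant.tflite", "wb") as f:
    f.write(quantized_tflite_model)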
In fact, the input and output types of that node should be the same, but after I use tf.lite.TFLiteConverter.from_keras_model(q_aware_model) I get the error described above.
Another question: if I don't call load_weights on the new model, I can successfully convert the .h5 model to TFLite and run inference with it. If I do call load_weights on the new model, the error above happens.
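To narrow down what load_weights actually changes, here is a hedged sketch (reusing the file names above) that counts how many weight tensors differ before and after loading the QAT checkpoint:

import numpy as np
import tensorflow_model_optimization as tfmot
from tensorflow.keras.models import load_model

model = load_model('inception_v3.h5', compile=False)
q_aware_model = tfmot.quantization.keras.quantize_model(model)

# Snapshot the freshly wrapped model, then load the QAT checkpoint.
before = [w.copy() for w in q_aware_model.get_weights()]
q_aware_model.load_weights('inception_quant.h5', by_name=True)
after = q_aware_model.get_weights()

changed = sum(not np.array_equal(b, a) for b, a in zip(before, after))
print(f"{changed} of {len(before)} weight tensors changed after load_weights")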
Fig. 1: model before loading weights
Fig. 2: model after loading weights
I can't find any difference apart from the parameters.
Fig. 3: TFLite model converted from the .h5 model before loading weights
Fig. 4: TFLite model converted from the .h5 model after loading weights