AnastGerus opened this issue 3 years ago (status: Open).
Please rename this file to .onnx, because I can't upload an .onnx file to GitHub.
Hi @AnastGerus ,
For LSTM conversion, there is an article and a colab describing the process, and you can follow the colab to do it. For the four kinds of quantization, maybe you can also try other options like float16 or dynamic range. Hope it could help.
We cannot guarantee that an ONNX-generated graph can always be converted to TFLite + quantization, as we don't know how the initial TF graph is created by ONNX.
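For readers who haven't opened the colab, the stateless path it describes boils down to roughly the following minimal sketch (the layer size and input shape here are placeholders, not values from this issue):

```python
import tensorflow as tf

# Stateless Keras LSTM -> TFLite, the path the article/colab covers.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(20, time_major=False, return_sequences=True,
                         input_shape=(28, 28)),
])
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # the LSTM is lowered to the fused TFLite LSTM op
```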
Hi @lintian06, Thank you for your attention!
I've seen those articles and the colab. The "Known issues/limitations" part of the article actually says:
- Currently there is support only for converting stateless Keras LSTM (default behavior in Keras). Stateful Keras LSTM conversion is future work.
- It is still possible to model a stateful Keras LSTM layer using the underlying stateless Keras LSTM layer and managing the state explicitly in the user program. Such a TensorFlow program can still be converted to TensorFlow Lite using the feature being described here.
So I've tried to do it another way, but it doesn't work with 'Integer' quantization. It works correctly with 'float16' and 'dynamic range' quantization, but 2x smaller isn't enough for me, unfortunately. It would be great to get a 4x smaller model (as described in the article).
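For context, these are the standard post-training quantization knobs being compared here, in a minimal sketch (saved_model_dir and representative_dataset are placeholders, not the actual model from this issue):

```python
import tensorflow as tf

saved_model_dir = "./saved_model"  # placeholder path
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)

# Dynamic range quantization (~2x smaller): int8 weights, float activations.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Float16 quantization (~2x smaller): additionally restrict the supported types.
# converter.target_spec.supported_types = [tf.float16]

# Integer quantization with float fallback (~4x smaller): additionally provide a
# calibration dataset. This is the mode that fails for the stateful LSTM in this issue.
# converter.representative_dataset = representative_dataset

tflite_model = converter.convert()
```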
I don't have to use ONNX conversion. This issue can be easily reproduced in the colab if you change the LSTM layer to stateful=True and add the Integer optimization.
Maybe you could provide some code example of what is described above: "model a stateful Keras LSTM layer using the underlying stateless Keras LSTM layer and managing the state explicitly in the user program"?
How exactly can I manage the states in C++ code (TF Lite) when I have converted the stateless Keras LSTM layer? Wouldn't it have problems with the dynamic range of the numbers if the calibrator (during the TF -> TF Lite conversion) computed it for a stateless LSTM, but I then use the model as a stateful one?
Thanks, Best regards, Anastasiia
Regarding limitation 1: it is because the latent variable is turned into constants, and the converter cannot handle the control flow correctly. Can you try stateless Keras LSTM?
If you stick to stateful Keras LSTM, for the C++ code you can extract the subgraph of the TF model with only one step ("without the for-loop"), and use C++ code to loop the LSTM state. However, you have to handle the TFLite conversion in detail: define a signature_def specifying inputs and outputs when exporting a saved model and convert to TFLite from the saved model, use from_keras_model, or create a concrete function to do it. It is basically a Keras model nesting inside another, and you only convert the internal one. No matter which way, the TFLite model only contains one step of the LSTM, and the state related to init/update/exit is handled by your code.
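A minimal sketch of that single-step idea (the sizes and names are illustrative, not taken from the colab): the stateless LSTMCell is exposed as a function whose hidden and cell states are ordinary inputs and outputs, so the exported model contains no loop and no internal variables.

```python
import tensorflow as tf

UNITS, INPUT_SIZE = 20, 28  # placeholder sizes

cell = tf.keras.layers.LSTMCell(UNITS)
cell.build((1, INPUT_SIZE))  # create the kernel/recurrent weights up front

@tf.function(input_signature=[
    tf.TensorSpec([1, INPUT_SIZE], tf.float32),  # one time step of input
    tf.TensorSpec([1, UNITS], tf.float32),       # previous hidden state h
    tf.TensorSpec([1, UNITS], tf.float32),       # previous cell state c
])
def lstm_step(x, h, c):
    out, new_states = cell(x, states=[h, c])
    return out, new_states[0], new_states[1]

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [lstm_step.get_concrete_function()])
tflite_model = converter.convert()
```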
Hope it could help.
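The caller's side of the state management (the part the earlier C++ question refers to) would then look roughly like this with the Python Interpreter; the C++ tflite::Interpreter calls are analogous. The input/output ordering, UNITS, stream_of_frames, and tflite_model (from the sketch above) are placeholders.

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()   # x, h, c (check the actual order/indices)
out = interpreter.get_output_details()  # out, new_h, new_c (likewise)

UNITS = 20
stream_of_frames = np.random.rand(10, 1, 28).astype(np.float32)  # placeholder "infinite" stream
h = np.zeros((1, UNITS), dtype=np.float32)
c = np.zeros((1, UNITS), dtype=np.float32)
for x in stream_of_frames:
    interpreter.set_tensor(inp[0]['index'], x)
    interpreter.set_tensor(inp[1]['index'], h)
    interpreter.set_tensor(inp[2]['index'], c)
    interpreter.invoke()
    y = interpreter.get_tensor(out[0]['index'])  # the step's output
    h = interpreter.get_tensor(out[1]['index'])  # carry the state to the next step
    c = interpreter.get_tensor(out[2]['index'])
```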
Hi @lintian06 ,
I have tried to do something similar by "fixing" the size of the input variables with a concrete function (code based on the colab):
```python
import numpy as np
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(20, stateful=True, time_major=False, return_sequences=True)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

run_model = tf.function(lambda x: model(x))

BATCH_SIZE = 1
STEPS = 28
INPUT_SIZE = 28
concrete_func = run_model.get_concrete_function(
    tf.TensorSpec([BATCH_SIZE, STEPS, INPUT_SIZE], tf.dtypes.float32))

# model directory
MODEL_DIR = r"\test"  # raw string so that "\t" is not interpreted as a tab
model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
converter.experimental_enable_resource_variables = True
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]

def representative_dataset():
    for _ in range(100):
        data = np.random.rand(BATCH_SIZE, STEPS, INPUT_SIZE)
        yield [data.astype(np.float32)]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()
```
Still, it doesn't work with Integer quantization (only Float16 and DynamicRange). It can work if you remove stateful=True from the LSTM.
If you meant another idea, please help with a code example. Thank you for your help! I'm new to LSTM conversion in TF & TFLite.
Best regards, Anastasiia
@AnastGerus I have the same problem converting a model with stateful LSTM layers.
Have you seen this: https://github.com/tensorflow/tensorflow/issues/48282
I think this could be a workaround, but I haven't figured out how to handle the hidden states of multiple LSTM layers.
Hi @jakobwowy, Thank you for this link, I'll try it.
Regarding an NN with multiple LSTM layers: I had a similar problem in the onnx-tf converter, and I used a workaround of splitting my NN into a few NNs, each with only one LSTM layer, and running the inference sequentially (roughly as sketched below). I hope it will help you!
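A minimal, hypothetical sketch of that split-and-chain workaround (the file names and shapes are placeholders, not the actual models from this thread):

```python
import numpy as np
import tensorflow as tf

def run_part(model_path, input_data):
    """Run one single-LSTM sub-model and return its output."""
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp['index'], input_data)
    interpreter.invoke()
    return interpreter.get_tensor(out['index'])

x = np.random.rand(1, 1, 512).astype(np.float32)   # placeholder input
h = run_part('part1_first_lstm.tflite', x)          # sub-network up to the first LSTM
y = run_part('part2_second_lstm.tflite', h)         # rest of the network with the second LSTM
```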
Thanks, Anastasiia
Unfortunately, this solution didn't help me, because I need to work with the format converted from ONNX (TensorflowRep), which is not Keras...
Hi @lintian06, Could you please provide an example of "defining a signature_def specifying inputs and outputs when exporting a saved model" when I don't have a Keras model? Only the ONNX model or a TF frozen graph are available in my case...
Thank you, Anastasiia
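One possible (unverified) way to get a concrete function with explicit inputs and outputs out of a TF frozen graph, sketched here with placeholder file and tensor names, is to wrap the GraphDef and prune it:

```python
import tensorflow as tf

# Load the frozen GraphDef ('frozen_graph.pb' and the tensor names below are placeholders).
graph_def = tf.compat.v1.GraphDef()
with open('frozen_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

def _import_graph():
    tf.compat.v1.import_graph_def(graph_def, name='')

wrapped = tf.compat.v1.wrap_function(_import_graph, signature=[])
step_fn = wrapped.prune(
    feeds=wrapped.graph.get_tensor_by_name('input:0'),
    fetches=wrapped.graph.get_tensor_by_name('output:0'))

# Either convert the pruned function directly...
converter = tf.lite.TFLiteConverter.from_concrete_functions([step_fn])
tflite_model = converter.convert()

# ...or attach it as a signature when exporting a SavedModel:
# tf.saved_model.save(tf.Module(), 'export_dir',
#                     signatures={'serving_default': step_fn})
```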
Hi @AnastGerus , I have the same problem when converting torch -> onnx -> tflite. Have you solved it?
It seems tensorflow==2.6.0 or tf-nightly can convert to tflite, but it logs an error when trying to run the network forward.
Hi @lzc16 , Unfortunately, no. The issue isn't solved. In my case the failure happens during the conversion itself.
Hi @lzc16 , I have managed to convert my model using the nightly version, but I also get an error at runtime:
```
Could not find variable lstm_kernel_lstm_31. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Container localhost does not exist. (Could not find resource: localhost/lstm_kernel_lstm_31)
(while executing 'ReadVariableOp' via Eager)
Node number 26 (TfLiteFlexDelegate) failed to invoke.
Node number 42 (WHILE) failed to invoke.
```
Hi @lintian06 , Could you please recommend a way to solve it? Does it exist?
Thanks
Hi @AnastGerus ,
Thanks for your reply. I have run into the same situation as you: the model can be converted to tflite with tf-nightly, and it logs an error similar to yours. It seems my problem is that tflite can't support the 'FlexVarHandleOp' op. Maybe defining the LSTM with Keras could solve the problem, but that doesn't suit me, since my path is torch -> onnx -> tflite. You could try that method, though.
Best regards!
Hi @lzc16 , I have the same path - 'torch -> onnx -> tflite' - so I can't use Keras either. I'm just trying to find a solution or some workaround.
Best regards
Hi @AnastGerus , Is there any suggestion for modifying the tensorflow .pb model manually? Or for redefining the net in TensorFlow (using Keras) and loading the ONNX model?
Thanks
Hi @AnastGerus
I got the explanation from https://github.com/tensorflow/tensorflow/issues/52041.
Hi @lzc16, I can see from that issue that the problem was in the onnx-tf conversion. In my case, the TF model works fine, and the TFLite model with float16 or dynamic range quantization also works fine. So the problem is specific to Integer quantization.
Could you please recommend a way to modify the tensorflow .pb model manually?
Thanks
Hi @AnastGerus ,
I tried to set the tensorflow .pb weights manually, following this author: https://github.com/onnx/onnx-tensorflow/issues/971#issuecomment-926467939.
By the way, did you try another way in onnx-tf that makes the TF model work? I want to figure out where I made the mistake.
Hi @lzc16 , I don't really get what other way you are talking about... Please specify.
Hi @AnastGerus ,
I'm sorry, I didn't describe the problem clearly. I can see you can make it work when you set QUANTIZATION = 'None'.
However, in my case, when I convert onnx to tflite without quantization, it already logs an error.
Therefore, I'm confused about how you convert torch to onnx, and I wondered whether I overlooked some key steps.
1. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- TensorFlow installation (pip package or built from source): pip package
- TensorFlow library (version, if pip package or github SHA, if built from source): tf-nightly 2.7.0.dev20210819
2. Code
Please check my code below. Please put the 'LSTMlayer.onnx' file into the folder 'path' (or modify the path). This code fails with the described issue, but if you change QUANTIZATION = 'None', it works.
```python
import tensorflow as tf
import os
import numpy as np

path = r"\test"  # raw string so that "\t" is not interpreted as a tab
QUANTIZATION = 'IntegerWithFloatFallback'  # 'IntegerWithFloatFallback' or 'None'

def representative_dataset():
    dummy = np.zeros((1, 1, 512), dtype=np.float32)
    yield [dummy]

converter = tf.lite.TFLiteConverter.from_saved_model(path)
# we need experimental_enable_resource_variables and "select TensorFlow ops" for
# AssignVariableOp, ReadVariableOp, VarHandleOp operations, otherwise you will get an error
converter.experimental_enable_resource_variables = True
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS
]
if QUANTIZATION == 'IntegerWithFloatFallback':
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset

tflite_model = converter.convert()
with open(os.path.join(path, 'model.tflite'), 'wb') as f:
    f.write(tflite_model)
```
3. Failure after conversion
File "...\git_issue_code.py", line 26, in tflite_model = converter.convert() File "...\lib\site-packages\tensorflow\lite\python\lite.py", line 763, in wrapper return self._convert_and_export_metrics(convert_func, *args, kwargs) File "...\lib\site-packages\tensorflow\lite\python\lite.py", line 749, in _convert_and_export_metrics result = convert_func(self, *args, *kwargs) File "...\lib\site-packages\tensorflow\lite\python\lite.py", line 1031, in convert return self._optimize_tflite_model( File "...\lib\site-packages\tensorflow\lite\python\convert_phase.py", line 226, in wrapper raise error from None # Re-throws the exception. File "...\lib\site-packages\tensorflow\lite\python\convert_phase.py", line 216, in wrapper return func(args, kwargs) File "...\lib\site-packages\tensorflow\lite\python\lite.py", line 714, in _optimize_tflite_model model = self._quantize( File "...\lib\site-packages\tensorflow\lite\python\lite.py", line 517, in _quantize calibrate_quantize = _calibrator.Calibrator(result, File "...\lib\site-packages\tensorflow\lite\python\optimize\calibrator.py", line 78, in init raise ValueError("Failed to parse the model: %s." % e) ValueError: Failed to parse the model: Op FlexVarHandleOp missing inputs.
I have also reproduced this issue using the Keras LSTM layer directly. AssignVariableOp, ReadVariableOp and VarHandleOp are needed when you use stateful=True for the LSTM layer (which is a very important option for "infinite" data, e.g. an audio stream). Please contact me if you need some additional info.
Thanks, Best regards, Anastasiia
Hi @lzc16,
I didn't convert torch to onnx; I've used the export function.