josephrocca opened this issue 2 years ago
I don't have permission to push my changes to your private branch. Please try:
Add the two lines below into the tf2onnx/tflite_handlers/tfl_direct.py file after line 93:
```python
@tfl_op("TFL_READ_VARIABLE", tf_op="ReadVariableOp")
@tfl_op("TFL_VAR_HANDLE", tf_op="VarHandleOp")
```
After this, please try your patch again.
@fatcat-z Thanks for your response! Do you mean after line 91? If so, I tried that and, weirdly, it didn't work; the same error messages were logged as before:
https://colab.research.google.com/gist/josephrocca/5af909bd240264cdecd4598903be8dfa
...
ERROR - tf2onnx.tfonnx: Tensorflow op [first_layerconv/states1: TFL_VAR_HANDLE] is not supported
...
ERROR - tf2onnx.tfonnx: Tensorflow op [streamable_model_12/first_layerconv/concat/ReadVariableOp: TFL_READ_VARIABLE] is not supported
...
I've added you as a collaborator to that repo in case you wanted to try any changes yourself, but please feel free to suggest other things for me to try. Thanks for your help with this :pray:
Yes, after line 91.
Did you run `python setup.py develop` to install the local tf2onnx version in your test environment?
@fatcat-z Oh, I was installing it like this, as you can see in the linked Colab above:
```
!pip install git+https://github.com/josephrocca/tensorflow-onnx.git@patch-1
```
but I just tried this:
```
!git clone --branch patch-1 https://github.com/josephrocca/tensorflow-onnx
%cd /content/tensorflow-onnx
!python setup.py develop
```
and the same errors occurred. I'm a bit of a Python noob (I come from the web/JS world), so please excuse my incompetence here 😬
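One quick sanity check for which tf2onnx build is actually being imported (just a sketch, using standard Python and the tf2onnx version string):

```python
import inspect
import tf2onnx

# Prints the version/commit string (e.g. "1.12.0/<commit>") and the file path of
# the tf2onnx package Python actually picked up, so you can tell whether the
# pip-installed copy or the locally cloned `setup.py develop` copy is in use.
print(tf2onnx.__version__)
print(inspect.getfile(tf2onnx))
```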
I'm using your branch to convert that tflite model to onnx and got the error below:
2022-10-14 14:49:15,567 - ERROR - Unsupported ops: Counter({'VarHandleOp': 14, 'ReadVariableOp': 14})
This is expected, because we haven't implemented these 2 TF ops yet. I'll try to see if it can be done soon.
@josephrocca,
Please try the code in this branch, this commit. Those ops are designed for training, which is not supported by tf2onnx, and they won't affect the inference results; removing them should have no impact on the final output.
Please use it to generate a new onnx file and see if the results are correct.
Thanks!! That solved it (EDIT: See below for some complications). Really appreciate how fast you managed to make this fix 🙏
There are some other downstream issues in ORT Web preventing the model from running correctly, but I think they might be specific to ORT Web rather than this conversion process. I'll post a separate issue if I can't work out what's going wrong there.
(I'll leave it to you to re-open this if you'd like to keep this open until it's fully confirmed that these changes didn't affect the correctness of the inference results.)
Do you mind running ORT on your local machine to confirm the correctness? Anyway, please feel free to update this thread with any information you have about the correctness.
:+1: I've added a correctness check (and reminder to comment here) to the todo list for this project.
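For reference, a minimal local check with onnxruntime's Python API could look like this (the file name and the all-ones dummy input are placeholders):

```python
import numpy as np
import onnxruntime as ort

# Hypothetical file name; use the onnx file produced by the conversion above.
sess = ort.InferenceSession("soundstream_encoder.onnx")

inp = sess.get_inputs()[0]
# Replace symbolic/dynamic dimensions with 1 to build a dummy input.
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
dummy = np.ones(shape, dtype=np.float32)

outputs = sess.run(None, {inp.name: dummy})
print(outputs[0].shape)
```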
@fatcat-z An update on this: I tried to check correctness, and while the tflite file works fine, both ORT Python and ORT Web are throwing an error using the converted ONNX model. Here's a full minimal reproduction of the tflite inference --> conversion to onnx --> onnx inference:
```
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node.
Name:'streamable_model_12/first_layerconv/conv1d_36/BiasAdd;streamable_model_12/first_layerconv/conv1d_36/Conv1D/Squeeze;streamable_model_12/first_layerconv/conv1d_36/BiasAdd/ReadVariableOp;Conv1D;streamable_model_12/first_layerconv/conv1d_36/Conv1D__39'
Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:41 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, onnxruntime::TensorShapeVector&, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false.
The input tensor cannot be reshaped to the requested shape. Input shape:{1,320,1}, requested shape:{1,1,1,368}
```
I originally posted a question about this here: https://github.com/microsoft/onnxruntime/issues/13383, but it looks like it might be a conversion issue rather than a runtime issue, since I see a `ReadVariableOp` in that error message?
This looks like a conversion issue where some information was lost. Working on a fix.
Hello, I am trying to convert a tflite float16 model into ONNX. I tried the change mentioned above to avoid the output-names issue, but I still got the ReadVariableOp error. Here is the error output:
```
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_2/Variable_11: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_2/Variable1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_1/Variable_11: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_1/Variable1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm/Variable_11: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm/Variable1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Unsupported ops: Counter({'TFL_VAR_HANDLE': 6})
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_2/Variable_1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm_2/Read/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_2/Variable: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm_2/Read_1/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_1/Variable_1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm_1/Read/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm_1/Variable: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm_1/Read_1/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm/Variable_1: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm/Read/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [lstm/Variable: TFL_VAR_HANDLE] is not supported
ERROR:tf2onnx.tfonnx:Tensorflow op [model/lstm/Read_1/ReadVariableOp: TFL_READ_VARIABLE] is not supported
ERROR:tf2onnx.tfonnx:Unsupported ops: Counter({'TFL_VAR_HANDLE': 6, 'TFL_READ_VARIABLE': 6})
```
Although the tflite fp16 model was converted into ONNX, the result doesn't seem to be as expected. Any updates on the fix? Thanks.
The solution is there and I'm working on the code. The current ETA is this week, due to some unexpected things.
Thank you for the update. Looking forward to it.
I believe that the variable operations are not only for training but also for inference. The model uses them as memory between inferences: the value from the last inference is concatenated into the current inference. The shape problem is most likely coming from the fact that the [1,48,1] values from the ReadVariableOp are missing (the input has 1×320×1 = 320 elements, while the requested 1×1×1×368 shape needs 368, i.e. exactly 48 more). Unfortunately, I think these operations will have to be supported if you want to convert your model.
@josephrocca
Probably you can access the private branch for local debugging.
@fatcat-z That seems like it might have worked - it definitely fixed the error messages that I was getting previously.
Here's a conversion notebook that includes an inference comparison between the original tflite model and the new onnx model: https://colab.research.google.com/gist/josephrocca/ecb5a2faf54b06eb700ecc562557c6a9/onnx-runtime-python-inference.ipynb#scrollTo=6dhtRr-ru03Q
TFLite outputs:
[[[ 0.7904744 10.276169 28.19359 -0.5269828 0.5269828
-10.803152 5.0063386 14.755524 -3.6888812 16.863457
14.22854 28.19359 -11.593626 -19.23488 2.107932
-10.803152 3.1618981 15.546 -22.396778 12.3841
-13.96505 -14.228542 13.4380665 -0.5269828 -1.8444405
-12.120609 17.39044 -5.0063386 5.0063386 11.066643
11.857117 5.533322 -21.342813 -1.8444405 -2.3714237
2.8984065 -11.066643 0.7904744 -0.26349163 -16.863457
-2.3714237 -36.098335 -5.0063386 -5.796813 10.276169
28.19359 3.6888814 -21.869797 4.2158647 3.952373
3.4253898 -3.9523726 -10.012678 0.5269828 -10.803152
-7.1142707 -1.053966 6.060305 5.7968135 1.5809493
2.3714237 -2.634915 4.742847 -5.0063386 ]]]
ONNX outputs:
[[[ -0.526983 7.1142707 28.193592 -2.107932 0.526983
-13.701559 1.3174576 12.647593 -6.8507795 16.863457
14.755525 28.193592 -10.803152 -16.072983 2.634915
-11.330135 3.6888812 15.282508 -22.396778 12.911084
-13.701559 -14.228541 13.438067 -0.526983 -1.5809491
-11.857118 17.65393 -5.0063386 4.479356 11.593626
11.593626 5.533322 -22.133287 -2.3714237 -2.3714237
3.4253898 -10.012677 0.2634915 0. -17.39044
-1.8444406 -37.415794 -5.26983 -6.8507795 11.857118
28.193592 3.4253898 -23.714235 4.215864 5.0063386
3.6888812 -4.215864 -10.53966 1.053966 -12.384101
-7.641254 -0.79047453 6.3237963 5.533322 1.3174576
3.4253898 -1.8444406 4.7428474 -4.215864 ]]]
Some of those numbers are almost exactly the same (to ~3 decimal places), while others are quite far off. Is that expected? Guessing it depends a lot on the particular model and the sorts of ops it has? I will continue to investigate this.
In any case, please feel free to close this if you are satisfied that the TFL_VAR/READ support is solved based on the above notebook outputs. Thanks for your work on this!
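For reference, a rough sketch of how the two arrays above could be compared numerically (assuming `tflite_out` and `onnx_out` hold the printed arrays as numpy float32 tensors):

```python
import numpy as np

# tflite_out and onnx_out are assumed to be the arrays printed above.
abs_diff = np.abs(tflite_out - onnx_out)
rel_diff = abs_diff / (np.abs(tflite_out) + 1e-6)

print("max abs diff:", abs_diff.max())
print("mean abs diff:", abs_diff.mean())
print("max rel diff:", rel_diff.max())
# A loose tolerance check; the right tolerance depends on the model and its ops.
print("allclose(rtol=1e-2, atol=1e-2):",
      np.allclose(tflite_out, onnx_out, rtol=1e-2, atol=1e-2))
```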
Okay, I think what might have been happening there was that I was feeding `np.ones` into the model, which is very "unnatural" data given that it's expecting an audio waveform, so the output is more chaotic than normal. When I feed audio into it, the output numbers are within ~20% of one another:
If I understand the fix correctly, it will remove some VAR_OPS and fill others with zeroes. This will fix your shape issue, but it won't work with the architecture of the model. As you can see below from the snippet of the encoder, the model uses the VAR_OPS to keep track of the previous inference:
VAR_HANDLE should handle the mapping to the variable. ASSIGN_VARIABLE should copy the input to the variable. READ_VARIABLE should copy the variable to the output.
Without these the output of the model will be different from the original tensorflow model.
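In other words, a minimal Python sketch of those semantics (illustrative only, not actual TFLite code):

```python
# A toy model of the three variable ops described above.
variables = {}  # resource handle -> tensor

def var_handle(name):
    # TFL_VAR_HANDLE: produces a handle that identifies the variable.
    return name

def assign_variable(handle, value):
    # TFL_ASSIGN_VARIABLE: copies the input tensor into the variable.
    variables[handle] = value

def read_variable(handle):
    # TFL_READ_VARIABLE: copies the variable's current value to the output.
    return variables[handle]
```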
@dramaticlama Looks like you're correct!
https://github.com/google/lyra/issues/99#issuecomment-1318936519
> Yes, these models are stateful indeed. There is a possibility to export stateless models that return a state pointer after every call, which needs to be copied over to the input of the next call. That forces the user to handle the state manually, but that is what you need for this use-case. I am no longer at Google, so I don't have access to the exporting pipeline, but maybe there is a way to convert from one model to the other?
@fatcat-z I'm wondering if ONNX supports this sort of thing? If not, maybe the converter could just turn variable nodes into input/output nodes, or something like that?
When I prepared this PR, I was using another version of soundstream_encoder.tflite that you shared. For that one, the PR happened to work.
@dramaticlama's thoughts are correct, so we probably won't be able to find a way to convert such a model successfully.
@fatcat-z Ah I see. Can you see any possible path forward here to convert this into a stateless ONNX model? For example, could we turn `AssignVariable` nodes into output nodes and `ReadVariable` nodes into input nodes, and then log the `VarHandle` pairings to the user as warnings during the conversion process? (see @dramaticlama's screenshot above)
So the user would have to manually pipe the output variables back into the inputs during the next inference, according to the details of the logged `VarHandle`s.
(I guess, ideally, all the "stateful" variables could be "bundled" into a single output/input so the model would just have to pipe that one extra output back into the one extra input.)
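A hedged sketch of what that manual state piping could look like on the caller side, assuming the converter exposed each variable as an extra input/output pair (all names, shapes, and the frame size here are hypothetical, loosely based on the shapes mentioned above):

```python
import numpy as np
import onnxruntime as ort

# Hypothetical stateless export of the encoder.
sess = ort.InferenceSession("soundstream_encoder_stateless.onnx")

# Hypothetical state tensors exposed by the converter as extra inputs/outputs.
state_names = ["first_layerconv_states1"]
state = {n: np.zeros((1, 48, 1), dtype=np.float32) for n in state_names}

def encode(audio_frames):
    """audio_frames: iterable of (1, 320) float32 chunks (shape is illustrative)."""
    for frame in audio_frames:
        outputs = sess.run(None, {"input_audio": frame, **state})
        encoding = outputs[0]
        # Pipe each state output back into the matching state input for the next call.
        for i, name in enumerate(state_names, start=1):
            state[name] = outputs[i]
        yield encoding
```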
New Operator
The `TFL_VAR_HANDLE` and `TFL_READ_VARIABLE` operators are used in Lyra's "soundstream encoder" model and "lyragan" model. I don't know the specifics of what these operators are for other than what is implied by their name, and I probably can't contribute them. Here are the two models:
And here's a minimal reproduction of the errors:
https://colab.research.google.com/gist/josephrocca/5af909bd240264cdecd4598903be8dfa
Note that the above Colab uses this patch (the only change is here) of tf2onnx to hackily avoid this problem.
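The conversions use tf2onnx's tflite front-end; a command of roughly this form produces the logs below (exact flags and opset may differ from what the Colab actually uses):

```
!python -m tf2onnx.convert --tflite soundstream_encoder.tflite --output soundstream_encoder.onnx
!python -m tf2onnx.convert --tflite lyragan.tflite --output lyragan.onnx
```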
Here's the full outputs of the conversion commands:
soundstream_encoder.tflite

```
/usr/lib/python3.7/runpy.py:125: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
2022-10-12 07:37:31,601 - INFO - tf2onnx: inputs: None
2022-10-12 07:37:31,601 - INFO - tf2onnx: outputs: None
2022-10-12 07:37:32,743 - INFO - tf2onnx.tfonnx: Using tensorflow=2.9.2, onnx=1.12.0, tf2onnx=1.12.0/7e0144
2022-10-12 07:37:32,743 - INFO - tf2onnx.tfonnx: Using opset
```

lyragan.tflite

```
/usr/lib/python3.7/runpy.py:125: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
2022-10-12 08:00:14,602 - INFO - tf2onnx: inputs: None
2022-10-12 08:00:14,603 - INFO - tf2onnx: outputs: None
2022-10-12 08:00:15,636 - INFO - tf2onnx.tfonnx: Using tensorflow=2.9.2, onnx=1.12.0, tf2onnx=1.12.0/7e0144
2022-10-12 08:00:15,636 - INFO - tf2onnx.tfonnx: Using opset
```