Closed: PetiteFleurPF closed this issue 4 months ago
I don't see the correlation between the attached ONNX file and the image you posted.
I can't send you the final model; the one I shared with you is similar and reproduces the same error. All the images I've pasted are from the final model.
For example, after using the model I shared with you with the pipeline, I obtained this: and this: So you can reproduce the error.
I still don't understand what your concern is. The ONNX DepthwiseConv2d -> Clip -> Conv2d -> Reshape -> Transpose -> Reshape section you illustrate is a rather redundant combination of multiple OPs. I see no problem with changing the shape from [1,24,1,1] to [1,6,4]. Your ONNX file contains a great deal of unnecessary processing.
This tool automatically eliminates all unnecessary processing.
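For illustration, here is a minimal numpy sketch of why such a chain is redundant on a [1, 24, 1, 1] tensor; the permutation (0, 2, 3, 1) is an assumption, but any transpose here only moves axes of size 1, so no data is reordered:

import numpy as np

# With a [1, 24, 1, 1] tensor, a transpose only moves axes of size 1,
# so Reshape -> Transpose -> Reshape collapses into a single Reshape.
x = np.arange(24, dtype=np.float32).reshape(1, 24, 1, 1)

chained = x.transpose(0, 2, 3, 1).reshape(1, 6, 4)  # the redundant chain
direct = x.reshape(1, 6, 4)                         # the single equivalent op

assert np.array_equal(chained, direct)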
OK, I'm in the process of removing the post-processing from the model so that we can start from the same observation: the bounding boxes produced by the model are purely random, which can be seen in their varying sizes and locations. In short, the converted version seems to have lost all the information obtained during training. I'll write again when the model is uploaded.
I'm trying to understand why, which is why I made a hypothesis about the modification of the nodes.
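For reference, one way to cut the graph just before the post-processing is onnx.utils.extract_model. This is only a sketch: the input name "images" is an assumption, while the output names are the ones used in the onnx2tf command below.

import onnx.utils

# Extract the sub-model that ends at the raw head outputs, dropping all
# post-processing nodes that follow them. The input name is assumed.
onnx.utils.extract_model(
    "model.onnx",
    "model_no_postprocess.onnx",
    input_names=["images"],
    output_names=[
        "/head/regression_head/Concat_12_output_0",
        "/Softmax_output_0",
    ],
)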
onnx2tf -i model.onnx \
-onimc /head/regression_head/Concat_12_output_0 /Softmax_output_0 \
-cotof
The myriad uses of NonZero in post-processing severely inhibit the normal model transformation behavior of onnx2tf because the output is non-deterministic. If you want to implement NMS, bounding-box filtering via NonZero + TopK + If is quite redundant and wasteful for inference in TFLite.
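One common alternative is to leave NMS out of the graph entirely and filter on the host after inference. Here is a minimal sketch with placeholder outputs and illustrative thresholds, using tf.image.non_max_suppression:

import numpy as np
import tensorflow as tf

# Placeholder stand-ins for the raw detection outputs of the model:
# boxes (N, 4) and per-box scores (N,).
boxes = np.random.rand(3234, 4).astype(np.float32)
scores = np.random.rand(3234).astype(np.float32)

# Host-side NMS instead of NonZero + TopK + If inside the graph.
keep = tf.image.non_max_suppression(
    boxes, scores, max_output_size=100,
    iou_threshold=0.5, score_threshold=0.3,
)
final_boxes = tf.gather(boxes, keep).numpy()
print(final_boxes.shape)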
I still haven't updated the model ^^ Don't focus on the post-processing; my problem is not coming from there :) You will see. (I am deleting the post-processing now to recreate a model without it, so you can do a proper investigation.) Give me 5 minutes.
onnx2tf -i model.onnx \
-onimc /head/regression_head/Concat_12_output_0 /Softmax_output_0 \
-cotof
And here you reproduced what I said :) Everything after that is post-processing, but at this point the output produced by the "Concat" node is already incorrect.
onnx2tf -i model.onnx \
-onimc /head/regression_head/Concat_12_output_0 /Softmax_output_0 \
-cotof
OK, it seems you already did it. So just run a detection test and you will see that the results from the ONNX and TFLite models are absolutely not the same, and the TFLite results consist of random bounding boxes.
I just finished deleting the post-processing and extracting the data, so:
With this new version, I obtained this before applying any post-processing step:
With the converted version:
So you can see that the results are not the same.
Thus, I still don't understand why you claim that the output values are different, even though the element-by-element comparison shows that all of them are identical. If you are really comparing outputs with all post-processing removed, then they must match.
This is because the -cotof option compares the ONNX output with the TensorFlow output element by element, for all elements, with near-perfect precision.
You always seem to paste the output as an image, but that doesn't give me any of the information I need. Does the shape of the output tensor in ONNX exactly match the shape of the output tensor in TensorFlow? If the channel positions do not match exactly, there is absolutely no point in comparing element-by-element values.
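For reference, a minimal sketch of such a check: run the same input through both runtimes and compare shapes first, then values. The file names and the (1, 3, 320, 320) input shape are assumptions, not taken from this issue; the transpose reflects onnx2tf's usual NCHW-to-NHWC conversion.

import numpy as np
import onnxruntime as ort
import tensorflow as tf

x = np.ones((1, 3, 320, 320), dtype=np.float32)  # assumed input shape

# ONNX side.
sess = ort.InferenceSession("model.onnx")
onnx_out = sess.run(None, {sess.get_inputs()[0].name: x})[0]

# TFLite side; onnx2tf normally emits NHWC inputs, hence the transpose.
interp = tf.lite.Interpreter(model_path="model_float32.tflite")
interp.allocate_tensors()
inp = interp.get_input_details()[0]
interp.set_tensor(inp["index"], x.transpose(0, 2, 3, 1))
interp.invoke()
tfl_out = interp.get_tensor(interp.get_output_details()[0]["index"])

print(onnx_out.shape, tfl_out.shape)              # shapes must match first
print(np.allclose(onnx_out, tfl_out, atol=1e-4))  # then element-wise values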
When I added my post-processing steps into the ONNX model before the conversion (without NMS), the bounding boxes made sense in terms of localization with ONNX, but not with TFLite.
Let me organize this. I still don't understand the claim, so this will be my last reply.
The outputs /head/regression_head/Concat_12_output_0 and /Softmax_output_0 all match at the element level. I do not know where the [3234, 4] came from.

INFO: onnx_output_name: /head/regression_head/Concat_12_output_0 tf_output_name: tf.concat/concat:0 shape: (1, 3234, 4) dtype: float32 validate_result: Matches
INFO: onnx_output_name: /head/classification_head/Concat_12_output_0 tf_output_name: tf.concat_1/concat:0 shape: (1, 3234, 91) dtype: float32 validate_result: Matches
INFO: onnx_output_name: /Softmax_output_0 tf_output_name: tf.nn.softmax//Softmax:0 shape: (1, 3234, 91) dtype: float32 validate_result: Matches
The solution was "Whenever all input data are set to 1". With other TFLite models that we used, that wasn't the case. I tested it and everything works perfectly now. Many, many thanks! :D
Issue Type
Others
OS
Linux
onnx2tf version number
1.23.0
onnx version number
1.15.0
onnxruntime version number
1.17.1
onnxsim (onnx_simplifier) version number
0.4.33
tensorflow version number
2.16.1
Download URL for ONNX without post-processing
https://we.tl/t-Z6RrkZRZxf (TFLite version)
https://we.tl/t-PH8hLCyGu9 (ONNX version)
Parameter Replacement JSON
Description
Product development. Impact: inference time could be improved by obtaining a TFLite version.
The results from the TFLite version of our model look like random bounding-box coordinates (in both size and location) when you plot them.
I investigated the differences between the two sets of results; they appear before the post-processing part. While observing those differences, I noticed that some nodes were deleted during conversion. I tested every option, including the one that disables simplification, but none was the solution. It is very surprising, because these nodes did exist at one point: they appeared during the intermediate conversion steps and validation.
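One way to verify which nodes actually exist in the exported graph is a quick pass over the ONNX protobuf; a minimal sketch using the onnx Python package (file name assumed):

import onnx

# List the node types involved in the disappearing section to confirm
# what is present in the graph before and after simplification.
model = onnx.load("model.onnx")
for node in model.graph.node:
    if node.op_type in ("Reshape", "Transpose", "Concat", "Softmax"):
        print(node.op_type, node.name, list(node.output))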
To successfully convert the ONNX model. You don't need more than that.
PS: the previous problem was solved by deleting only the NMS part, not everything in the post-processing, but you helped us a lot in highlighting the correct way to do things. So thank you.