gurudatta-patil opened 4 months ago
I will keep notes on the material to research again when I have enough time to work on it.
```
INFO: 50 / 1719
INFO: onnx_op_type: Expand onnx_op_name: wa/xvector/block1/tdnnd1/cam_layer/Expand
INFO: input_name.1: wa/xvector/block1/tdnnd1/cam_layer/Unsqueeze_output_0 shape: [1, 128, 'unk__77', 1] dtype: float32
INFO: input_name.2: wa/xvector/block1/tdnnd1/cam_layer/Where_output_0 shape: [4] dtype: int64
INFO: output_name.1: wa/xvector/block1/tdnnd1/cam_layer/Expand_output_0 shape: ['unk__80', 128, 'unk__83', 'unk__86'] dtype: float32
INFO: tf_op_type: Expand
INFO: input.1.input_tensor: name: tf.expand_dims_4/ExpandDims:0 shape: (1, 2, 1, 128) dtype: <dtype: 'float32'>
INFO: input.2.input_tensor_shape: name: tf.where/SelectV2:0 shape: (4,) dtype: <dtype: 'int64'>
INFO: output.1.output: name: tf.math.multiply_9/Mul:0 shape: (None, 2, None, 128) dtype: <dtype: 'float32'>
```
This looks like a bug in the `Expand` conversion: the `tf.ones`-based broadcast breaks when undefined dimensions are present.
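For context, a minimal sketch (mine, not onnx2tf's code) of the `tf.ones` broadcast pattern that an ONNX `Expand` typically maps to; it works when `shape` is fully known at runtime, but constructing the ones tensor fails when the shape carries undefined dimensions at conversion time:

```python
import tensorflow as tf

# ONNX Expand broadcasts `x` to `shape` with NumPy-style rules.
# Multiplying by tf.ones(shape) reproduces that broadcast in TF,
# but building the ones tensor breaks when `shape` contains
# undefined (None / 'unk__*') dimensions at graph-construction time.
def expand_like_onnx(x: tf.Tensor, shape: tf.Tensor) -> tf.Tensor:
    ones = tf.ones(shape, dtype=x.dtype)  # broadcast carrier
    return x * ones

x = tf.reshape(tf.range(3, dtype=tf.float32), [1, 3, 1])
print(expand_like_onnx(x, tf.constant([2, 3, 4])).shape)  # (2, 3, 4)
```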
https://github.com/PINTO0309/onnx2tf/blob/3ce052df092f253ab531ac065d988870d388d7e2/onnx2tf/ops/Expand.py#L110-L120

Thank you! Are there any particular steps I can follow differently to work around it for now? Shape: [1, -1, 80], LSTM model, final conversion: INT8 tflite model.
There is a JSON-based behavior-correction (parameter replacement) function, but it is difficult to understand and takes a very long time to learn.
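For anyone attempting it anyway, a hedged sketch of what such a parameter-replacement JSON might look like for one of the OPs above; the exact fields and values must be checked against the parameter-replacement section of the onnx2tf README, and the transpose perm here is purely illustrative:

```json
{
  "format_version": 1,
  "operations": [
    {
      "op_name": "wa/xvector/block1/tdnnd1/cam_layer/Expand",
      "param_target": "outputs",
      "param_name": "wa/xvector/block1/tdnnd1/cam_layer/Expand_output_0",
      "post_process_transpose_perm": [0, 2, 1, 3]
    }
  ]
}
```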
I'm concentrating on other tasks for a while, so if you're in a hurry, try these. The conversion success rate is said to be 100%.
It turned out to be an `AveragePool1D` problem, not an `Expand` problem. This is a rather tricky one. The issue has nothing to do with LSTM or INT8 quantization; it stems from specification differences between frameworks in the pooling process.

Unfortunately, this `AveragePool` is not compatible with TensorFlow's `AveragePool`. The padding size is calculated by rather complicated logic and is forced to conform to TensorFlow, so I have to investigate how to reduce the padding size to zero. Essentially, the output tensor of `AveragePool` must be TF: `[1, 100, 128]`, ONNX: `[1, 128, 100]`.
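To make the layout difference concrete, a small sketch (the 200-frame input is an arbitrary example): ONNX pools channels-first `[N, C, W]`, while `tf.keras.layers.AveragePooling1D` pools channels-last `[N, W, C]`, so the `[1, 128, 100]` and `[1, 100, 128]` outputs above are the same data under a transpose:

```python
import numpy as np
import tensorflow as tf

x_onnx = np.random.rand(1, 128, 200).astype(np.float32)  # NCW (ONNX layout)
x_tf = tf.transpose(x_onnx, perm=[0, 2, 1])              # -> NWC (TF layout)
y_tf = tf.keras.layers.AveragePooling1D(pool_size=2, strides=2)(x_tf)
print(y_tf.shape)                                # (1, 100, 128): TF layout
print(tf.transpose(y_tf, perm=[0, 2, 1]).shape)  # (1, 128, 100): ONNX layout
```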
We did try ai-edge-torch; at first it failed, but after some tinkering we could convert the model. However, it still does not convert the average-pooling layer correctly.
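For reference, a minimal sketch of that ai-edge-torch path (the one-layer model and shapes are placeholders for the actual CAM++ network; the API follows the ai-edge-torch README):

```python
import torch
import ai_edge_torch  # pip install ai-edge-torch

# Stand-in for the real network: a single AvgPool1d with ceil_mode=True,
# the exact OP this thread found problematic.
model = torch.nn.Sequential(
    torch.nn.AvgPool1d(kernel_size=3, stride=1, padding=1, ceil_mode=True),
).eval()
sample_inputs = (torch.randn(1, 128, 100),)

edge_model = ai_edge_torch.convert(model, sample_inputs)
edge_model.export("campp_avgpool_repro.tflite")
```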
Thanks for sharing your valuable experience. This is quite a difficult issue.
I will add debugging resources.

| Dynamic | Static128 | Static1 |
|---|---|---|
| avgpool1d_dynamic.onnx.zip | avgpool1d_static.onnx.zip | avgpool1d_static1.onnx.zip |
```
onnx2tf -i avgpool1d_static1.onnx -cotof

INFO: validation_conditions: np.allclose(onnx_outputs, tf_outputs, rtol=0.0, atol=0.0001, equal_nan=True)
INFO: onnx_output_name: wa/xvector/block1/tdnnd1/cam_layer/AveragePool_output_0 tf_output_name: tf.compat.v1.squeeze/Squeeze:0 shape: (1, 128, 1) dtype: float32 validate_result: Unmatched max_abs_error: 0.9900000095367432
```
WIP: https://github.com/PINTO0309/onnx2tf/compare/main...fix_undef_expand
`BatchNormalization` 1D: `TensorShape([1, None, 128])` vs. `TensorShape([1, 128, 128])`
I have fixed and released the critical problems except for `AveragePool`; `AveragePool` with `ceil_mode=1` and a dynamic tensor as input is extremely difficult to fix due to compatibility issues with TensorFlow.

The previous problem was that no error was being raised in the `AveragePool` where a conversion error should have occurred; the latest onnx2tf now correctly generates a conversion error there. The root cause is the difficulty of calculating the `ExtraPadding` needed to resolve the differences between PyTorch's and TensorFlow's pooling specifications.
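For reference, a sketch (with a hypothetical helper name) of the output-length arithmetic that `AveragePool.py` evaluates, the same expression that raises in the traceback below when the dynamic axis makes `i` equal to `None`:

```python
import math

# i: input length, pb/pe: begin/end pads, k: kernel, s: stride, d: dilation.
# ceil_mode=1 rounds up, which can demand one extra padded frame compared
# with TensorFlow's floor-based pooling.
def pool_out_len(i, pb, pe, k, s, d=1, ceil_mode=False):
    func = math.ceil if ceil_mode else math.floor
    return int(func((i + pb + pe - d * (k - 1) - 1) / s + 1))

print(pool_out_len(100, 1, 1, 3, 2, ceil_mode=False))  # 50
print(pool_out_len(100, 1, 1, 3, 2, ceil_mode=True))   # 51
# With a dynamic axis, i is None and `i + pb` raises the TypeError below.
```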
```
INFO: 39 / 1464
INFO: onnx_op_type: AveragePool onnx_op_name: wa/xvector/block1/tdnnd1/cam_layer/AveragePool
INFO: input_name.1: wa/xvector/block1/tdnnd1/nonlinear2/relu/Relu_output_0 shape: [1, 128, 'unk__71'] dtype: float32
INFO: output_name.1: wa/xvector/block1/tdnnd1/cam_layer/AveragePool_output_0 shape: [1, 128, 'unk__77'] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
File "/home/xxxxx/git/onnx2tf/onnx2tf/utils/common_functions.py", line 312, in print_wrapper_func
result = func(*args, **kwargs)
File "/home/xxxxx/git/onnx2tf/onnx2tf/utils/common_functions.py", line 385, in inverted_operation_enable_disable_wrapper_func
result = func(*args, **kwargs)
File "/home/xxxxx/git/onnx2tf/onnx2tf/utils/common_functions.py", line 55, in get_replacement_parameter_wrapper_func
func(*args, **kwargs)
File "/home/xxxxx/git/onnx2tf/onnx2tf/ops/AveragePool.py", line 171, in make_node
output_spatial_shape = [
File "/home/xxxxx/git/onnx2tf/onnx2tf/ops/AveragePool.py", line 172, in <listcomp>
func((i + pb + pe - d * (k - 1) - 1) / s + 1)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
ERROR: input_onnx_file_path: ../cam++_vin.onnx
ERROR: onnx_op_name: wa/xvector/block1/tdnnd1/cam_layer/AveragePool
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.
```
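Following the error message's own suggestion, the dynamic time axis can be pinned to a static length with `-ois` (the 200-frame length is an arbitrary placeholder; `input` is the input name from the Description below):

```
onnx2tf -i cam++_vin.onnx -ois input:1,200,80 -cotof
```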
Issue Type: Others
OS: Linux
onnx2tf version number: 1.25.6
onnx version number: 1.16.1
onnxruntime version number: 1.18.1
onnxsim (onnx_simplifier) version number: 0.4.33
tensorflow version number: 2.17.0
Download URL for ONNX: https://github.com/gurudatta-patil/ML-Campp/blob/main/cam%2B%2B_vin.onnx
Parameter Replacement JSON:
Description
Research
Command: `onnx2tf -i cam++_vin.onnx -osd -coion`
Input size: [1, -1, 80]; name: input; tensor: float32[1, time_frames, 80]
I also tried a few other commands, including passing an .npy file as input. I am trying to get an INT8 output for the model: a lightweight format with minimal quantization error, to deploy on an embedded device.
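For the final INT8 step, a sketch of full-integer TFLite conversion from the saved_model that onnx2tf emits; the `saved_model` directory name, the 200-frame length, and the random calibration data are placeholders, and real audio features of shape [1, time_frames, 80] should be fed as the representative dataset:

```python
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    # Replace with real [1, time_frames, 80] feature batches for calibration.
    for _ in range(100):
        yield [np.random.rand(1, 200, 80).astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```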