PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
MIT License

[silero_vad] Question about Pad #42

Closed. bayleef1 closed this issue 1 year ago.

bayleef1 commented 1 year ago

Issue Type

Others

onnx2tf version number

1.2.4

Download URL for ONNX

https://github.com/snakers4/silero-vad/blob/v3.1/files/silero_vad.onnx

Parameter Replacement JSON

{}

Description

  1. Purpose: Personal development
  2. What: A conversion error occurred when running the script: onnx2tf -i silero_vad.onnx

INFO: onnx_op_type: Pad
INFO: input shape: [1, 1, 1, 512]
...
INFO: output shape: [1, 1, 1, 768]
INFO: tf_op_type: Pad
INFO: input shape: [1, 1, 1, 512]
...
INFO: output shape: [1, 1, 257, 512]

ValueError: Exception encountered when calling layer "tf.squeeze" Can not squeeze dim[3], expected a dimension of 1, got 512

So why is the Pad's behavior not the same during conversion, and what should I do?

  3. How: I've tried Parameter replacement, but the error still occurred, so I think it is related to the dimensional expansion after Reshape.
  4. Why: Because I want a TFLite model.
PINTO0309 commented 1 year ago

The error cannot be reproduced. Please post accurate information that can be reproduced. Also, LSTM is not supported at this time. https://github.com/PINTO0309/onnx2tf#supported-layers

INFO: onnx_op_type: Pad onnx_op_name: Pad_27
INFO:  input_name.1: 129 shape: [1, 1, 1, 'unk__22'] dtype: float32
INFO:  input_name.2: 151 shape: [8] dtype: <class 'numpy.int64'>
INFO:  output_name.1: 152 shape: [1, 1, 1, 'unk__23'] dtype: float32
INFO: tf_op_type: Pad
INFO:  input.1.x: name: tf.reshape_1/Reshape:0 shape: (1, 1, 1, None) dtype: <dtype: 'float32'> 
INFO:  input.2.paddings: shape: (4, 2) dtype: int64 
INFO:  input.3.constant_value: val: 0 
INFO:  input.4.mode: val: reflect 
INFO:  input.5.tensor_rank: val: 4 
INFO:  output.1.output: name: tf.compat.v1.pad/Pad_27:0 shape: (1, 1, 257, None) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Squeeze onnx_op_name: sng_Squeeze_0
INFO:  input_name.1: 152 shape: [1, 1, 1, 'unk__23'] dtype: float32
INFO:  input_name.2: 890 shape: (1,) dtype: <class 'numpy.int64'>
INFO:  output_name.1: 747 shape: [1, 1, 'unk__23'] dtype: float32
INFO: tf_op_type: squeeze_v2
INFO:  input.1.input: name: tf.compat.v1.pad/Pad_27:0 shape: (1, 1, 257, None) dtype: <dtype: 'float32'> 
INFO:  input.2.axis: val: [3] 
INFO:  output.1.output: name: tf.compat.v1.squeeze/sng_Squeeze_0:0 shape: (1, 1, 257) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Conv onnx_op_name: Conv_37
INFO:  input_name.1: 747 shape: [1, 1, 'unk__23'] dtype: float32
INFO:  input_name.2: feature_extractor.forward_basis_buffer shape: [258, 1, 256] dtype: <class 'numpy.float32'>
INFO:  output_name.1: 162 shape: [1, 258, 'unk__25'] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "/home/xxxxx/git/onnx2tf/onnx2tf/utils/common_functions.py", line 262, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/home/xxxxx/git/onnx2tf/onnx2tf/utils/common_functions.py", line 324, in inverted_operation_enable_disable_wrapper_func
    result = func(*args, **kwargs)
  File "/home/xxxxx/git/onnx2tf/onnx2tf/ops/Conv.py", line 306, in make_node
    tf.nn.convolution(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/keras/layers/core/tf_op_layer.py", line 119, in handle
    return TFOpLambda(op)(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.nn.convolution" (type TFOpLambda).

Depth of output (258) is not a multiple of the number of groups (257) for '{{node tf.nn.convolution/convolution}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 64, 1], use_cudnn_on_gpu=true](tf.nn.convolution/convolution/ExpandDims, tf.nn.convolution/convolution/ExpandDims_1)' with input shapes: [1,1,1,257], [1,256,1,258].

Call arguments received by layer "tf.nn.convolution" (type TFOpLambda):
  • input=tf.Tensor(shape=(1, 1, 257), dtype=float32)
  • filters=array([[[ 0.0000000e+00,  0.0000000e+00,  0.0000000e+00, ...,
          0.0000000e+00,  0.0000000e+00,  0.0000000e+00]],

       [[ 1.5059065e-04,  1.5054530e-04,  1.5040927e-04, ...,
         -7.3891333e-06, -3.6956797e-06,  0.0000000e+00]],

       [[ 6.0227187e-04,  6.0154643e-04,  5.9937179e-04, ...,
          5.9032966e-05,  2.9552080e-05,  0.0000000e+00]],

       ...,

       [[ 1.3547717e-03,  1.3511009e-03,  1.3401083e-03, ...,
          1.9878628e-04,  9.9663193e-05,  0.0000000e+00]],

       [[ 6.0227187e-04,  6.0154643e-04,  5.9937179e-04, ...,
         -5.9032966e-05, -2.9552080e-05,  0.0000000e+00]],

       [[ 1.5059065e-04,  1.5054530e-04,  1.5040927e-04, ...,
          7.3891333e-06,  3.6956797e-06,  0.0000000e+00]]], dtype=float32)
  • strides=['64']
  • padding='VALID'
  • data_format=None
  • dilations=['1']
  • name=None
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC, use the -kt option.
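For reference, a concrete form of the static-shape rewrite suggested by the last hint is shown below; the input name "input" and the shape 1,512 are the values used in the follow-up comments in this thread, so adjust them for other models (-ois is the short form of --overwrite_input_shape).

onnx2tf -i silero_vad.onnx --overwrite_input_shape "input:1,512"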
PINTO0309 commented 1 year ago

If I were to try, I would first separate the model into two parts, using the red line (in the attached image) as a boundary, then convert the upper part with onnx2tf and the lower part with onnx-tensorflow. However, since a model containing an LSTM has a special structure called control flow, converting it to TFLite will be quite difficult due to TensorFlow's specifications.

[image]

bayleef1 commented 1 year ago

Thanks for your kind reply. The error can be reproduced with the following command:

onnx2tf -i silero_vad.onnx --overwrite_input_shape "input:1,512"

Also, when I converted it with onnx-tensorflow, another error occurred. See https://github.com/onnx/onnx-tensorflow/issues/1042#issuecomment-1344031092

PINTO0309 commented 1 year ago

I have reflected the modifications and am running regression tests in GitHub Actions here; this will take about an hour because all 51 models that have been converted successfully in the past are retested. https://github.com/PINTO0309/onnx2tf/actions/runs/3663720781/jobs/6193680127

onnx2tf -i silero_vad.onnx --overwrite_input_shape "input:1,512"
INFO: onnx_op_type: Relu onnx_op_name: Relu_579
INFO:  input_name.1: 887 shape: [1, 64, 3] dtype: float32
INFO:  output_name.1: 724 shape: [1, 64, 3] dtype: float32
INFO: tf_op_type: relu
INFO:  input.1.features: name: tf.math.add_61/Add:0 shape: (1, 3, 64) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.nn.relu_15/Relu:0 shape: (1, 3, 64) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Transpose onnx_op_name: Transpose_580
INFO:  input_name.1: 724 shape: [1, 64, 3] dtype: float32
INFO:  output_name.1: 725 shape: [3, 1, 64] dtype: float32
INFO: tf_op_type: transpose_v2
INFO:  input.1.a: name: tf.nn.relu_15/Relu:0 shape: (1, 3, 64) dtype: <dtype: 'float32'> 
INFO:  input.2.perm: val: [1, 0, 2] 
INFO:  output.1.output: name: tf.compat.v1.transpose_62/transpose:0 shape: (3, 1, 64) dtype: <dtype: 'float32'> 
ERROR: LSTM OP is not yet implemented.
PINTO0309 commented 1 year ago

https://github.com/PINTO0309/onnx2tf/releases/tag/1.2.5

PINTO0309 commented 1 year ago

1. Upper half conversion

onnxsim silero_vad.onnx silero_vad.onnx --overwrite-input-shape "input:1,512"
sne4onnx \
--input_onnx_file_path silero_vad.onnx \
--output_onnx_file_path silero_vad_up_to_LSTM.onnx \
--input_op_names input \
--output_op_names 725
onnx2tf -i silero_vad_up_to_LSTM.onnx
 tf.compat.v1.transpose_62 (TFOpLambda)       (3, 1, 64)                     0               ['tf.nn.relu_15[0][0]']                        

============================================================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
____________________________________________________________________________________________________________________________________________

saved_model output started ==========================================================
WARNING: This model contains GroupConvolution and is automatically optimized for TFLite, but is not output because saved_model does not support GroupConvolution. If saved_model is needed, specify --disable_group_convolution to retransform the model.
WARNING:absl:Please consider providing the trackable_obj argument in the from_concrete_functions. Providing without the trackable_obj argument is deprecated and it will use the deprecated conversion path.
Estimated count of arithmetic ops: 1.683 M  ops, equivalently 0.841 M  MACs
Float32 tflite output complete!
Estimated count of arithmetic ops: 1.683 M  ops, equivalently 0.841 M  MACs
Float16 tflite output complete!

or

onnx2tf -i silero_vad.onnx -onimc 725
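As a quick sanity check of the converted upper half, the float32 TFLite file can be run once with the TFLite interpreter. The sketch below is untested here; the model path assumes onnx2tf's default saved_model output directory and file naming, so adjust it to whatever the conversion actually produced.

import numpy as np
import tensorflow as tf

# Path is an assumption based on onnx2tf's default output location; adjust as needed.
interpreter = tf.lite.Interpreter(
    model_path="saved_model/silero_vad_up_to_LSTM_float32.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy chunk shaped like the fixed (1, 512) input used above.
dummy = np.random.rand(*input_details[0]["shape"]).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()

out = interpreter.get_tensor(output_details[0]["index"])
print(out.shape)  # should correspond to ONNX output 725, i.e. (3, 1, 64)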

2. Extract LSTM in the lower half

sne4onnx \
--input_onnx_file_path silero_vad.onnx \
--output_onnx_file_path silero_vad_LSTM_and_beyond.onnx \
--input_op_names 725 c0 h0 \
--output_op_names output hn cn
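
To finish the split suggested earlier, the extracted lower half could then be converted with onnx-tensorflow rather than onnx2tf. The snippet below is only an outline of that step; as noted above, onnx-tensorflow hit its own error on this model, and the control-flow structure of the LSTM makes a further TFLite conversion difficult.

import onnx
from onnx_tf.backend import prepare

# Convert the extracted LSTM-and-beyond subgraph to a TensorFlow SavedModel.
onnx_model = onnx.load("silero_vad_LSTM_and_beyond.onnx")
tf_rep = prepare(onnx_model)
tf_rep.export_graph("silero_vad_LSTM_and_beyond_saved_model")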