[DDNM] Support additional parameters to onnxsim

thekevinscott commented 1 year ago

Issue Type

Feature Request

onnx2tf version number

1.5.44

onnx version number

1.13.0

tensorflow version number

Version: 2.11.0

Download URL for ONNX

Not relevant

Parameter Replacement JSON

Not relevant

Description

Right now, a single parameter - --overwrite-input-shape - is forwarded to onnxsim.

It would be helpful to provide a mechanism to pass arbitrary parameters to onnxsim.

For a particular conversion I'm attempting, I see the following warning:

However, I have no way to forward --no-large-tensor to onnxsim.

PINTO0309 commented 1 year ago

I am well aware of the advantages and disadvantages of that option, as I was the one who inspired the addition of that feature to onnxsim. https://github.com/daquexian/onnx-simplifier/issues/178

If possible, would you be willing to share a sample ONNX file for testing?

PINTO0309 commented 1 year ago

Released with additional functionality. I am adding regression tests to my GitHub Actions for each feature revision, so if you can share the onnx before closing the issue, I would be very happy to share it.

https://github.com/PINTO0309/onnx2tf/actions/runs/4109150793/jobs/7090651297

https://github.com/PINTO0309/onnx2tf/wiki/model_status

https://github.com/PINTO0309/onnx2tf/releases/tag/1.6.0

thekevinscott commented 1 year ago

Thanks @PINTO0309 !

I'll share the ONNX model later today, as I'm using mobile data at the moment and the model is fairly big.

I returned to troubleshooting conversion today, and I'm unfortunately unable to reproduce the issue I reported originally. However, I'm now running into a new issue, and I have no idea what I changed!

(If you'd prefer I open a separate issue I'm happy to do that).

The converter is reporting that the output shape is None, which I assume to mean the output is dynamic (a batch size?) but I'm not sure how to specify that to the converter.

Model optimizing started ============================================================
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                       ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add                   │ 141            │ 141              │
│ Concat                │ 43             │ 43               │
│ Conv                  │ 120            │ 120              │
│ Cos                   │ 1              │ 1                │
│ Gather                │ 24             │ 24               │
│ Gemm                  │ 34             │ 34               │
│ InstanceNormalization │ 71             │ 71               │
│ MatMul                │ 12             │ 12               │
│ Mul                   │ 151            │ 151              │
│ Reshape               │ 166            │ 166              │
│ Resize                │ 5              │ 5                │
│ Shape                 │ 95             │ 95               │
│ Sigmoid               │ 67             │ 67               │
│ Sin                   │ 1              │ 1                │
│ Softmax               │ 6              │ 6                │
│ Transpose             │ 12             │ 12               │
│ Unsqueeze             │ 143            │ 143              │
│ Model Size            │ 433.8MiB       │ 433.8MiB         │
└───────────────────────┴────────────────┴──────────────────┘

Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                       ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add                   │ 141            │ 141              │
│ Concat                │ 43             │ 43               │
│ Conv                  │ 120            │ 120              │
│ Cos                   │ 1              │ 1                │
│ Gather                │ 24             │ 24               │
│ Gemm                  │ 34             │ 34               │
│ InstanceNormalization │ 71             │ 71               │
│ MatMul                │ 12             │ 12               │
│ Mul                   │ 151            │ 151              │
│ Reshape               │ 166            │ 166              │
│ Resize                │ 5              │ 5                │
│ Shape                 │ 95             │ 95               │
│ Sigmoid               │ 67             │ 67               │
│ Sin                   │ 1              │ 1                │
│ Softmax               │ 6              │ 6                │
│ Transpose             │ 12             │ 12               │
│ Unsqueeze             │ 143            │ 143              │
│ Model Size            │ 433.8MiB       │ 433.8MiB         │
└───────────────────────┴────────────────┴──────────────────┘

Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                       ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add                   │ 141            │ 141              │
│ Concat                │ 43             │ 43               │
│ Conv                  │ 120            │ 120              │
│ Cos                   │ 1              │ 1                │
│ Gather                │ 24             │ 24               │
│ Gemm                  │ 34             │ 34               │
│ InstanceNormalization │ 71             │ 71               │
│ MatMul                │ 12             │ 12               │
│ Mul                   │ 151            │ 151              │
│ Reshape               │ 166            │ 166              │
│ Resize                │ 5              │ 5                │
│ Shape                 │ 95             │ 95               │
│ Sigmoid               │ 67             │ 67               │
│ Sin                   │ 1              │ 1                │
│ Softmax               │ 6              │ 6                │
│ Transpose             │ 12             │ 12               │
│ Unsqueeze             │ 143            │ 143              │
│ Model Size            │ 433.8MiB       │ 433.8MiB         │
└───────────────────────┴────────────────┴──────────────────┘

Model optimizing complete!

Automatic generation of each OP name started ========================================
Automatic generation of each OP name complete!

Model loaded ========================================================================

Model convertion started ============================================================
INFO: input_op_name: input1 shape: ['batch_size', 3, 256, 256] dtype: float32
INFO: input_op_name: input2 shape: [1] dtype: float32

INFO: onnx_op_type: Unsqueeze onnx_op_name: wa/Unsqueeze
INFO:  input_name.1: input2 shape: [1] dtype: float32
INFO:  output_name.1: /Unsqueeze_output_0 shape: [1, 1] dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: input2 shape: (1,) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: val: [1, 1] 
INFO:  output.1.output: name: tf.reshape_21/Reshape:0 shape: (1, 1) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Conv onnx_op_name: wa/conv_in/Conv
INFO:  input_name.1: input1 shape: ['batch_size', 3, 256, 256] dtype: float32
INFO:  input_name.2: conv_in.weight shape: [128, 3, 3, 3] dtype: <class 'numpy.float32'>
INFO:  input_name.3: conv_in.bias shape: [128] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /conv_in/Conv_output_0 shape: ['batch_size', 128, 256, 256] dtype: float32
INFO: tf_op_type: convolution_v2
INFO:  input.1.input: name: input1 shape: (None, 256, 256, 3) dtype: <dtype: 'float32'> 
INFO:  input.2.weights: shape: (3, 3, 3, 128) dtype: <dtype: 'float32'> 
INFO:  input.3.bias: shape: (128,) dtype: float32 
INFO:  input.4.strides: val: [1, 1] 
INFO:  input.5.dilations: val: [1, 1] 
INFO:  input.6.padding: val: SAME 
INFO:  input.7.group: val: 1 
INFO:  output.1.output: name: tf.math.add_14/Add:0 shape: (None, 256, 256, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Mul onnx_op_name: wa/Mul
INFO:  input_name.1: /Unsqueeze_output_0 shape: [1, 1] dtype: float32
INFO:  input_name.2: /Constant_output_0 shape: [1, 64] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /Mul_output_0 shape: [1, 64] dtype: float32
INFO: tf_op_type: multiply
INFO:  input.1.x: name: tf.reshape_21/Reshape:0 shape: (1, 1) dtype: <dtype: 'float32'> 
INFO:  input.2.y: shape: (1, 64) dtype: float32 
INFO:  output.1.output: name: tf.math.multiply_77/Mul:0 shape: (1, 64) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Reshape onnx_op_name: wa/block.0/norm1/Reshape
INFO:  input_name.1: /conv_in/Conv_output_0 shape: ['batch_size', 128, 256, 256] dtype: float32
INFO:  input_name.2: /block.0/norm1/Constant_output_0 shape: [3] dtype: <class 'numpy.int64'>
INFO:  output_name.1: /block.0/norm1/Reshape_output_0 shape: ['batch_size', 32, 262144] dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: tf.compat.v1.transpose_25/transpose:0 shape: (None, 128, 256, 256) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: val: [-1, 32, 262144] 
INFO:  output.1.output: name: tf.reshape_22/Reshape:0 shape: (None, 32, 262144) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Shape onnx_op_name: wa/block.0/norm1/Shape
INFO:  input_name.1: /conv_in/Conv_output_0 shape: ['batch_size', 128, 256, 256] dtype: float32
INFO:  output_name.1: /block.0/norm1/Shape_output_0 shape: [4] dtype: int64
INFO: tf_op_type: shape_v2
INFO:  input.1.x: name: tf.math.add_14/Add:0 shape: (None, 256, 256, 128) dtype: <dtype: 'float32'> 
INFO:  input.2.out_type: name: int64 shape: () 
INFO:  output.1.output: name: tf.compat.v1.shape_7/wa/block.0/norm1/Shape:0 shape: (4,) dtype: <dtype: 'int64'> 

INFO: onnx_op_type: Sin onnx_op_name: wa/Sin
INFO:  input_name.1: /Mul_output_0 shape: [1, 64] dtype: float32
INFO:  output_name.1: /Sin_output_0 shape: [1, 64] dtype: float32
INFO: tf_op_type: sin
INFO:  input.1.x: name: tf.math.multiply_77/Mul:0 shape: (1, 64) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.sin_7/Sin:0 shape: (1, 64) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Cos onnx_op_name: wa/Cos
INFO:  input_name.1: /Mul_output_0 shape: [1, 64] dtype: float32
INFO:  output_name.1: /Cos_output_0 shape: [1, 64] dtype: float32
INFO: tf_op_type: cos
INFO:  input.1.x: name: tf.math.multiply_77/Mul:0 shape: (1, 64) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.cos_7/Cos:0 shape: (1, 64) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: InstanceNormalization onnx_op_name: wa/block.0/norm1/InstanceNormalization
INFO:  input_name.1: /block.0/norm1/Reshape_output_0 shape: ['batch_size', 32, 262144] dtype: float32
INFO:  input_name.2: /block.0/norm1/Constant_1_output_0 shape: [32] dtype: <class 'numpy.float32'>
INFO:  input_name.3: /block.0/norm1/Constant_2_output_0 shape: [32] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /block.0/norm1/InstanceNormalization_output_0 shape: ['batch_size', 32, 262144] dtype: float32
INFO: tf_op_type: batch_normalization
INFO:  input.1.x: name: tf.reshape_22/Reshape:0 shape: (None, 32, 262144) dtype: <dtype: 'float32'> 
INFO:  input.2.mean: name: tf.math.reduce_mean_14/Mean:0 shape: (None, 1, 262144) dtype: <dtype: 'float32'> 
INFO:  input.3.variance: name: tf.math.reduce_mean_15/Mean:0 shape: (None, 1, 262144) dtype: <dtype: 'float32'> 
INFO:  input.4.offset: shape: (1, 32, 1) dtype: <dtype: 'float32'> 
INFO:  input.5.scale: shape: (1, 32, 1) dtype: <dtype: 'float32'> 
INFO:  input.6.variance_epsilon: val: 9.999999974752427e-07 
INFO:  output.1.output: name: tf.__operators__.add_29/AddV2:0 shape: (None, 32, 262144) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Concat onnx_op_name: wa/Concat
INFO:  input_name.1: /Sin_output_0 shape: [1, 64] dtype: float32
INFO:  input_name.2: /Cos_output_0 shape: [1, 64] dtype: float32
INFO:  output_name.1: /Concat_output_0 shape: [1, 128] dtype: float32
INFO: tf_op_type: concat
INFO:  input.1.input0: name: tf.math.sin_7/Sin:0 shape: (1, 64) dtype: <dtype: 'float32'> 
INFO:  input.2.input1: name: tf.math.cos_7/Cos:0 shape: (1, 64) dtype: <dtype: 'float32'> 
INFO:  input.3.axis: val: 1 
INFO:  output.1.output: name: tf.concat_7/concat:0 shape: (1, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Reshape onnx_op_name: wa/block.0/norm1/Reshape_1
INFO:  input_name.1: /block.0/norm1/InstanceNormalization_output_0 shape: ['batch_size', 32, 262144] dtype: float32
INFO:  input_name.2: /block.0/norm1/Shape_output_0 shape: [4] dtype: int64
INFO:  output_name.1: /block.0/norm1/Reshape_1_output_0 shape: None dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: tf.compat.v1.transpose_26/transpose:0 shape: (None, 32, 262144) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: name: tf.compat.v1.shape_7/wa/block.0/norm1/Shape:0 shape: (4,) dtype: <dtype: 'int64'> 
INFO:  output.1.output: name: tf.compat.v1.transpose_27/transpose:0 shape: (None, None, None, None) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Gemm onnx_op_name: wa/dense.0/Gemm
INFO:  input_name.1: /Concat_output_0 shape: [1, 128] dtype: float32
INFO:  input_name.2: temb.dense.0.weight shape: [512, 128] dtype: <class 'numpy.float32'>
INFO:  input_name.3: temb.dense.0.bias shape: [512] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /dense.0/Gemm_output_0 shape: [1, 512] dtype: float32
INFO: tf_op_type: matmul
INFO:  input.1.x: name: Placeholder:0 shape: (1, 128) dtype: <dtype: 'float32'> 
INFO:  input.2.y: shape: (128, 512) dtype: <dtype: 'float32'> 
INFO:  input.3.z: shape: (512,) dtype: <dtype: 'float32'> 
INFO:  input.4.alpha: val: 1.0 
INFO:  input.5.beta: val: 1.0 
INFO:  output.1.output: name: Placeholder:0 shape: (1, 512) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Mul onnx_op_name: wa/block.0/norm1/Mul
INFO:  input_name.1: /block.0/norm1/Reshape_1_output_0 shape: None dtype: float32
INFO:  input_name.2: onnx::Mul_2067 shape: [128, 1, 1] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /block.0/norm1/Mul_output_0 shape: None dtype: float32
INFO: tf_op_type: multiply
INFO:  input.1.x: name: tf.compat.v1.transpose_27/transpose:0 shape: (None, None, None, None) dtype: <dtype: 'float32'> 
INFO:  input.2.y: shape: (1, 1, 1, 128) dtype: float32 
INFO:  output.1.output: name: tf.math.multiply_82/Mul:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Sigmoid onnx_op_name: wa/Sigmoid
INFO:  input_name.1: /dense.0/Gemm_output_0 shape: [1, 512] dtype: float32
INFO:  output_name.1: /Sigmoid_output_0 shape: [1, 512] dtype: float32
INFO: tf_op_type: sigmoid
INFO:  input.1.x: name: Placeholder:0 shape: (1, 512) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.sigmoid_14/Sigmoid:0 shape: (1, 512) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Add onnx_op_name: wa/block.0/norm1/Add
INFO:  input_name.1: /block.0/norm1/Mul_output_0 shape: None dtype: float32
INFO:  input_name.2: onnx::Add_2068 shape: [128, 1, 1] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /block.0/norm1/Add_output_0 shape: None dtype: float32
INFO: tf_op_type: add
INFO:  input.1.x: name: tf.math.multiply_82/Mul:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 
INFO:  input.2.y: shape: (1, 1, 1, 128) dtype: float32 
INFO:  output.1.output: name: tf.math.add_15/Add:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Mul onnx_op_name: wa/Mul_1
INFO:  input_name.1: /dense.0/Gemm_output_0 shape: [1, 512] dtype: float32
INFO:  input_name.2: /Sigmoid_output_0 shape: [1, 512] dtype: float32
INFO:  output_name.1: /Mul_1_output_0 shape: [1, 512] dtype: float32
INFO: tf_op_type: multiply
INFO:  input.1.x: name: Placeholder:0 shape: (1, 512) dtype: <dtype: 'float32'> 
INFO:  input.2.y: name: tf.expand_dims_7/ExpandDims:0 shape: (1, 512) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.multiply_84/Mul:0 shape: (1, 512) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Sigmoid onnx_op_name: wa/block.0/Sigmoid
INFO:  input_name.1: /block.0/norm1/Add_output_0 shape: None dtype: float32
INFO:  output_name.1: /block.0/Sigmoid_output_0 shape: None dtype: float32
INFO: tf_op_type: sigmoid
INFO:  input.1.x: name: tf.math.add_15/Add:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.sigmoid_15/Sigmoid:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Gemm onnx_op_name: wa/dense.1/Gemm
INFO:  input_name.1: /Mul_1_output_0 shape: [1, 512] dtype: float32
INFO:  input_name.2: temb.dense.1.weight shape: [512, 512] dtype: <class 'numpy.float32'>
INFO:  input_name.3: temb.dense.1.bias shape: [512] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /dense.1/Gemm_output_0 shape: [1, 512] dtype: float32
INFO: tf_op_type: matmul
INFO:  input.1.x: name: Placeholder:0 shape: (1, 512) dtype: <dtype: 'float32'> 
INFO:  input.2.y: shape: (512, 512) dtype: <dtype: 'float32'> 
INFO:  input.3.z: shape: (512,) dtype: <dtype: 'float32'> 
INFO:  input.4.alpha: val: 1.0 
INFO:  input.5.beta: val: 1.0 
INFO:  output.1.output: name: Placeholder:0 shape: (1, 512) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Mul onnx_op_name: wa/block.0/Mul
INFO:  input_name.1: /block.0/norm1/Add_output_0 shape: None dtype: float32
INFO:  input_name.2: /block.0/Sigmoid_output_0 shape: None dtype: float32
INFO:  output_name.1: /block.0/Mul_output_0 shape: None dtype: float32
INFO: tf_op_type: multiply
INFO:  input.1.x: name: tf.math.add_15/Add:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 
INFO:  input.2.y: name: tf.math.sigmoid_15/Sigmoid:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.math.multiply_87/Mul:0 shape: (None, None, None, 128) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Conv onnx_op_name: wa/block.0/conv1/Conv
INFO:  input_name.1: /block.0/Mul_output_0 shape: None dtype: float32
INFO:  input_name.2: down.0.block.0.conv1.weight shape: [128, 128, 3, 3] dtype: <class 'numpy.float32'>
INFO:  input_name.3: down.0.block.0.conv1.bias shape: [128] dtype: <class 'numpy.float32'>
INFO:  output_name.1: /block.0/conv1/Conv_output_0 shape: None dtype: float32
ERROR: The trace log is below.
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py", line 275, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py", line 343, in inverted_operation_enable_disable_wrapper_func
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/ops/Conv.py", line 111, in make_node
    and graph_node.inputs[0].shape[2:] == output_tensor_shape[2:]:
TypeError: 'NoneType' object is not subscriptable

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py:275, in print_node_info.<locals>.print_wrapper_func(*args, **kwargs)
    274 try:
--> 275     result = func(*args, **kwargs)
    277     if not non_verbose:

File /usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py:343, in inverted_operation_enable_disable.<locals>.inverted_operation_enable_disable_wrapper_func(*args, **kwargs)
    341 @wraps(func)
    342 def inverted_operation_enable_disable_wrapper_func(*args, **kwargs):
--> 343     result = func(*args, **kwargs)
    344     """
    345     The output_shape_trans stores the result of determining
    346     whether the final output shape of the connected OP differs between ONNX and TensorFlow.
   (...)
    351     False: No transposition
    352     """

File /usr/local/lib/python3.10/dist-packages/onnx2tf/ops/Conv.py:111, in make_node(graph_node, tf_layers_dict, **kwargs)
    109 if auto_pad == 'NOTSET':
    110     if input_tensor_rank >=2 \
--> 111         and graph_node.inputs[0].shape[2:] == output_tensor_shape[2:]:
    112         pad_mode = "SAME"

TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

SystemExit                                Traceback (most recent call last)
    [... skipping hidden 1 frame]

Cell In[86], line 152
    149     return tf.saved_model.load(ONNX2TF_TENSORFLOW_FROM_ONNX_NAME)
--> 152 onnx2f_tf_model = make_onnx2tf_tensorflow_if_not_exists(
    153 
    154             output_signaturedefs=True,
    155             output_h5=False,
    156             output_weights=True,
    157             output_integer_quantized_tflite=False,
    158     verbose=True,
    159             # non_verbose=True,
    160             # keep_ncw_or_nchw_or_ncdhw_input_names=['input1'],
    161             # **kwargs
    162 )

Cell In[86], line 134, in make_onnx2tf_tensorflow_if_not_exists(verbose, **kwargs)
    132 print('Making ONNX2TF tensorflow model')
--> 134 converted_model = convert(
    135     input_onnx_file_path=ONNX_FOR_ONNX2F_MODEL_NAME,
    136     output_folder_path=ONNX2TF_TENSORFLOW_FROM_ONNX_NAME,
    137     # output_signaturedefs=True,
    138     # output_h5=False,
    139     # output_weights=True,
    140     # output_integer_quantized_tflite=False,
    141     non_verbose=verbose is False,
    142     # keep_ncw_or_nchw_or_ncdhw_input_names=[INPUT1_NAME],
    143     # output_nms_with_dynamic_tensor=True,
    144     # batch_size=1,
    145     **kwargs
    146 )
    147 print(f'Made ONNX2TF Tensorflow model at {ONNX2TF_TENSORFLOW_FROM_ONNX_NAME}')        

File /usr/local/lib/python3.10/dist-packages/onnx2tf/onnx2tf.py:731, in convert(input_onnx_file_path, onnx_graph, output_folder_path, output_signaturedefs, output_h5, output_weights, output_integer_quantized_tflite, quant_type, quant_calib_input_op_name_np_data_path, input_output_quant_dtype, not_use_onnxsim, not_use_opname_auto_generate, batch_size, overwrite_input_shape, output_nms_with_dynamic_tensor, keep_ncw_or_nchw_or_ncdhw_input_names, keep_nwc_or_nhwc_or_ndhwc_input_names, keep_shape_absolutely_input_names, output_names_to_interrupt_model_conversion, disable_group_convolution, enable_batchmatmul_unfold, disable_suppression_flextranspose, number_of_dimensions_after_flextranspose_compression, optimization_for_gpu_delegate, replace_argmax_to_reducemax_and_indicies_is_int64, replace_argmax_to_reducemax_and_indicies_is_float32, replace_argmax_to_fused_argmax_and_indicies_is_int64, replace_argmax_to_fused_argmax_and_indicies_is_float32, fused_argmax_scale_ratio, replace_to_pseudo_operators, param_replacement_file, check_gpu_delegate_compatibility, check_onnx_tf_outputs_elementwise_close, check_onnx_tf_outputs_elementwise_close_full, check_onnx_tf_outputs_sample_data_normalization, check_onnx_tf_outputs_elementwise_close_rtol, check_onnx_tf_outputs_elementwise_close_atol, mvn_epsilon, non_verbose)
    730         graph_node.name = re.sub('^/', 'wa/', graph_node.name)
--> 731     op.make_node(
    732         graph_node=graph_node,
    733         tf_layers_dict=tf_layers_dict,
    734         **additional_parameters,
    735     )
    737 # List "optype"="Input"

File /usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py:336, in print_node_info.<locals>.print_wrapper_func(*args, **kwargs)
    331 print(
    332     f'{Color.RED}ERROR:{Color.RESET} ' +
    333     f'Also, for models that include NonMaxSuppression in the post-processing, ' +
    334     f'try the -onwdt option.'
    335 )
--> 336 sys.exit(1)

SystemExit: 1

During handling of the above exception, another exception occurred:

AssertionError                            Traceback (most recent call last)
    [... skipping hidden 1 frame]

File /usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py:2047, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
   2044 if exception_only:
   2045     stb = ['An exception has occurred, use %tb to see '
   2046            'the full traceback.\n']
-> 2047     stb.extend(self.InteractiveTB.get_exception_only(etype,
   2048                                                      value))
   2049 else:
   2050     try:
   2051         # Exception classes can customise their traceback - we
   2052         # use this in IPython.parallel for exceptions occurring
   2053         # in the engines. This should return a list of strings.

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:585, in ListTB.get_exception_only(self, etype, value)
    577 def get_exception_only(self, etype, value):
    578     """Only print the exception type and message, without a traceback.
    579 
    580     Parameters
   (...)
    583     value : exception value
    584     """
--> 585     return ListTB.structured_traceback(self, etype, value)

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:452, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
    449     chained_exc_ids.add(id(exception[1]))
    450     chained_exceptions_tb_offset = 0
    451     out_list = (
--> 452         self.structured_traceback(
    453             etype, evalue, (etb, chained_exc_ids),
    454             chained_exceptions_tb_offset, context)
    455         + chained_exception_message
    456         + out_list)
    458 return out_list

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:1118, in AutoFormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1116 else:
   1117     self.tb = tb
-> 1118 return FormattedTB.structured_traceback(
   1119     self, etype, value, tb, tb_offset, number_of_lines_of_context)

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:1012, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1009 mode = self.mode
   1010 if mode in self.verbose_modes:
   1011     # Verbose modes need a full traceback
-> 1012     return VerboseTB.structured_traceback(
   1013         self, etype, value, tb, tb_offset, number_of_lines_of_context
   1014     )
   1015 elif mode == 'Minimal':
   1016     return ListTB.get_exception_only(self, etype, value)

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:865, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
    856 def structured_traceback(
    857     self,
    858     etype: type,
   (...)
    862     number_of_lines_of_context: int = 5,
    863 ):
    864     """Return a nice text document describing the traceback."""
--> 865     formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
    866                                                            tb_offset)
    868     colors = self.Colors  # just a shorthand + quicker name lookup
    869     colorsnormal = colors.Normal  # used a lot

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:799, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
    796 assert isinstance(tb_offset, int)
    797 head = self.prepare_header(etype, self.long_header)
    798 records = (
--> 799     self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
    800 )
    802 frames = []
    803 skipped = 0

File /usr/local/lib/python3.10/dist-packages/IPython/core/ultratb.py:854, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
    848     formatter = None
    849 options = stack_data.Options(
    850     before=before,
    851     after=after,
    852     pygments_formatter=formatter,
    853 )
--> 854 return list(stack_data.FrameInfo.stack_data(etb, options=options))[tb_offset:]

File /usr/local/lib/python3.10/dist-packages/stack_data/core.py:578, in FrameInfo.stack_data(cls, frame_or_tb, options, collapse_repeated_frames)
    562 @classmethod
    563 def stack_data(
    564         cls,
   (...)
    568         collapse_repeated_frames: bool = True
    569 ) -> Iterator[Union['FrameInfo', RepeatedFrames]]:
    570     """
    571     An iterator of FrameInfo and RepeatedFrames objects representing
    572     a full traceback or stack. Similar consecutive frames are collapsed into RepeatedFrames
   (...)
    576     and optionally an Options object to configure.
    577     """
--> 578     stack = list(iter_stack(frame_or_tb))
    580     # Reverse the stack from a frame so that it's in the same order
    581     # as the order from a traceback, which is the order of a printed
    582     # traceback when read top to bottom (most recent call last)
    583     if is_frame(frame_or_tb):

File /usr/local/lib/python3.10/dist-packages/stack_data/utils.py:97, in iter_stack(frame_or_tb)
     95 while frame_or_tb:
     96     yield frame_or_tb
---> 97     if is_frame(frame_or_tb):
     98         frame_or_tb = frame_or_tb.f_back
     99     else:

File /usr/local/lib/python3.10/dist-packages/stack_data/utils.py:90, in is_frame(frame_or_tb)
     89 def is_frame(frame_or_tb: Union[FrameType, TracebackType]) -> bool:
---> 90     assert_(isinstance(frame_or_tb, (types.FrameType, types.TracebackType)))
     91     return isinstance(frame_or_tb, (types.FrameType,))

File /usr/local/lib/python3.10/dist-packages/stack_data/utils.py:176, in assert_(condition, error)
    174 if isinstance(error, str):
    175     error = AssertionError(error)
--> 176 raise error

AssertionError:

PINTO0309 commented 1 year ago

If the model is very large, it is better to run onnxsim at least 5 times in a row before using onnx2tf.

The number of OPs is only 1092, so the model does not appear to be very complex. As a rule of thumb, a model with 25,000 OP should run onnxsim approximately 10 times.

onnxsim

onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"

shell

onnx2tf -i xxx.onnx -ois input1:1,3,256,256 -osd

or

onnx2tf -i xxx.onnx -ois input1:1,3,256,256 input2:1 -osd

python script

overwrite_input_shape = ['input1:1,3,256,256', 'input2:1']

Incidentally, the model you are trying to convert probably has none of the effects of --no-large-tensor.

thekevinscott commented 1 year ago

Fantastic, really appreciate all the help @PINTO0309 !

The model converted successfully with your advice. I've now got yet another question (again, please let me know if you want to make a new issue, as I think you've successfully addressed the original issue I opened here).

I've got the following files:

model.onnx - the original ONNX file, converted from the original Pytorch model here
manual-tensorflow-from-onnx.zip - converted from onnx to tensorflow via onnx_tf - originally this has not been working but somehow now it is working? I don't know why
onnx2tf-tensorflow-from-onnx.zip - converted via onnx2tf.

(Here's the folder with the above three files).

Here are the image outputs (I'm running it against an image of Marie Curie using the "inpainting" task (a mask is drawn over the mouth and the image is reconstructed):

`model.onnx`

`manual-tensorflow-from-onnx.zip`

`onnx2tf-tensorflow-from-onnx.zip`

It looks like the version converted via onnx2tf is doing optimizations that are resulting in a degradation of the model. Do you have any insights into what these optimizations might be or how to tweak them as to result in a not-so-degraded image?

PINTO0309 commented 1 year ago

First, you need to try the command I posted in the TensorFlow Forum. -cotof and -cotoa option. You can see which operations are causing conversion errors.

onnx2tf -i ddnm.onnx -ois input1:1,3,256,256 input2:1 -cotof -cotoa 1e-1

The tool misjudged the correct transposed dimension just before the OP, which is marked Unmatched in yellow letters. This tool attempts to optimize the model to the limit by employing a completely different algorithm than onnx-tensorflow, but in exchange it introduces a certain probability of transposition errors. Models generated by onnx2tf run 20% to 30% faster than models generated by onnx-tensorflow.

Be sure to read this issue before rushing to get answers. When Onnx Matmul inputs have different dimension #133

The location of the transposition error can be identified from the log, so the JSON file can be used to compensate for the transposition error. Converting your model correctly is easy, but I don't have time right now. It is now 1:00 a.m. in Japan and I need to get some sleep to prepare for work tomorrow.

It may take some time to understand how to correct the tool's behavior, but if you want to convert a DDNM model in a hurry, read this tutorial and try to transpose it so that the first Unmatched opration is Matches. It is a process that requires patience, but we believe that correcting the transposition approximately 10 times will yield the correct results. https://github.com/PINTO0309/onnx2tf#parameter-replacement

samples https://github.com/PINTO0309/onnx2tf/blob/main/json_samples/replace_MobileBERT.json or https://github.com/PINTO0309/onnx2tf/blob/main/json_samples/replace_MobileFormer-e9.json

The probability of transposition errors in the Transformer model is quite high in the current version. Although it is expected to take considerable time to modify the system, the following issue is expected to significantly reduce the error rate. Implementation of strict mode #145

PINTO0309 commented 1 year ago

I am having trouble resolving the arithmetic errors in InstanceNormalization, but the final output agreed up to 1e-2.

https://github.com/PINTO0309/onnx2tf/blob/main/json_samples/replace_ddnm.json

wget https://github.com/PINTO0309/onnx2tf/blob/main/json_samples/replace_ddnm.json

onnx2tf -i ddnm.onnx -prf replace_ddnm.json -ois input1:1,3,256,256 input2:1 -cotof -cotoa 1e-2 -osd

converted from onnx to tensorflow via onnx_tf - originally this has not been working but somehow now it is working? I don't know why

This is because the optimization by onnxsim was not properly performed by you. The onnx-tensorflow also fails in many cases to transform models with undefined dimensions.

onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"
onnxsim xxx.onnx xxx.onnx --overwrite-input-shape "input1:1,3,256,256" "input2:1"

thekevinscott commented 1 year ago

Thanks for all the help @PINTO0309 .

For the command you shared, I'm getting:

Model convertion started ============================================================
INFO: input_op_name: input1 shape: [1, 3, 256, 256] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "/home/tower/anaconda3/lib/python3.9/site-packages/onnx2tf/utils/common_functions.py", line 275, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/home/tower/anaconda3/lib/python3.9/site-packages/onnx2tf/ops/Input.py", line 81, in make_node
    if graph_input.shape != tf.TensorShape(None) and len(graph_input.shape) in [3, 4, 5] \
  File "/home/tower/.local/lib/python3.9/site-packages/tensorflow/python/framework/tensor_shape.py", line 1294, in __ne__
    raise ValueError("The inequality of unknown TensorShapes is undefined.")
ValueError: The inequality of unknown TensorShapes is undefined.
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.

I also tried with -b 1 to rewrite it to a static shape. I believe the input is NCHW, so I think it must be something with the parameter replacement.

I'll keep working on it, I'm still getting up to speed on your tool - there's quite a lot to read up on.

thekevinscott commented 1 year ago

How did you generate the .json file? Is that something you're writing by hand? I see examples in json_samples but I don't see any documentation on how you're creating that file.

PINTO0309 commented 1 year ago

I also tried with -b 1 to rewrite it to a static shape. I believe the input is NCHW, so I think it must be something with the parameter replacement.

I have converted your model several times since last night. The figure below is the result of a fairly rigorous accuracy check, so the final output is shown as a lot of Unmatched, but it is not a conversion error. v1.6.0

This log shows a successful conversion one minute ago. https://s3.ap-northeast-2.wasabisys.com/temp-models/onnx2tf_175/ddnm.onnx

onnx2tf -i ddnm.onnx -prf replace_ddnm.json -ois input1:1,3,256,256 input2:1 -cotof -cotoa 1e-4

How did you generate the .json file? Is that something you're writing by hand? I see examples in json_samples but I don't see any documentation on how you're creating that file.

The JSON is written by hand, trying many conversions and identifying the points where accuracy errors occur one by one.

See the issue below for how I identified the problem areas and how I wrote the JSON. I have created a number of repositories in the past and written many READMEs, but many users did not read them no matter how much information I included. Therefore, in this repository, I decided not to include a lot of information in the README. [MobileFormer] Converted model outputs values mismatch with original ones. #105

PINTO0309 commented 1 year ago

I have found a bug in some parameter replacement logic that I will now fix. I noticed that the -osd option breaks the TensorFlow model.

PINTO0309 commented 1 year ago

Thanks to you I was able to notice the existence of a fatal bug. I immediately fixed the bug and released it with the parameter replacement function working correctly.

https://github.com/PINTO0309/onnx2tf/releases/tag/1.6.1

I will share the URL for downloading the saved_model that I tried to convert. I don't know how much the accuracy error affects the inference, but give it a try if you are interested.

saved_model + tflite https://s3.ap-northeast-2.wasabisys.com/temp-models/onnx2tf_175/saved_model.zip

thekevinscott commented 1 year ago

Thanks for the quick responses and bug fixes.

I downloaded your saved_model and can confirm it works perfectly.

However, I'm still running into the same issue when running the tool locally, and I don't know why:

> onnx2tf -V
1.6.3

> onnx2tf -i ddnm.onnx -prf replace_ddnm.json -ois input1:1,3,256,256 input2:1 -cotof -cotoa 1e-4 -osd

...

Model optimizing complete!

Automatic generation of each OP name started ========================================
Automatic generation of each OP name complete!

Model loaded ========================================================================

Model convertion started ============================================================
INFO: input_op_name: input1 shape: [1, 3, 256, 256] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "/home/tower/anaconda3/lib/python3.9/site-packages/onnx2tf/utils/common_functions.py", line 278, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/home/tower/anaconda3/lib/python3.9/site-packages/onnx2tf/ops/Input.py", line 81, in make_node
    if graph_input.shape != tf.TensorShape(None) and len(graph_input.shape) in [3, 4, 5] \
  File "/home/tower/.local/lib/python3.9/site-packages/tensorflow/python/framework/tensor_shape.py", line 1294, in __ne__
    raise ValueError("The inequality of unknown TensorShapes is undefined.")
ValueError: The inequality of unknown TensorShapes is undefined.
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.

I'm not sure why your command works but mine fails. Maybe it's a version mismatch; I pip installed via pypi, maybe I should try pip installing directly from this repo?

PINTO0309 commented 1 year ago

At the moment, I have no idea why the error is occurring, but from the logs, it looks like you are using Anaconda, so why not try using Docker?

thekevinscott commented 1 year ago

Sounds like you're onto something. I tried within a docker container and it appears to be running (at least, it's gotten past that error). So it must be a system thing. I'll keep troubleshooting and report what I find.

thekevinscott commented 1 year ago

Some updates from me:

I'm now on 1.6.7. I was able to use onnx2tf to convert the DDNM model successfully. Some comparisons:

Model converted via onnx_tf, duration to process an image: 21.09 seconds
Model converted via onnx2tf, duration to process an image: 11.5 seconds

That's a huge speed up! Kudos.

I do see two new issues that aren't blocking me, but might be worth reporting:

It appears that tflite conversions are now happening by default even though I haven't specified a flag:

> onnx2tf -i ddnm.onnx -prf replace_ddnm.json -ois input1:1,3,256,256 input2:1 -cotof -cotoa 1e-4 -osd

....

saved_model output started ==========================================================
saved_model output complete!
WARNING:absl:Please consider providing the trackable_obj argument in the from_concrete_functions. Providing without the trackable_obj argument is deprecated and it will use the deprecated conversion path.
Float32 tflite output complete!
Float16 tflite output complete!

Might be nice to only emit these files if specified by the user (I'm not seeing anything in the documentation about turning them off)

During validation of the model I'm seeing:

ONNX and TF output value validation started =========================================
INFO: validation_conditions: np.allclose(onnx_outputs, tf_outputs, rtol=0.0, atol=0.0001, equal_nan=True)
Traceback (most recent call last):
  File "/usr/local/bin/onnx2tf", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/onnx2tf.py", line 1828, in main
    model = convert(
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/onnx2tf.py", line 1298, in convert
    check_results = onnx_tf_tensor_validation(
  File "/usr/local/lib/python3.10/dist-packages/onnx2tf/utils/common_functions.py", line 2992, in onnx_tf_tensor_validation
    onnx_tensor_shape = onnx_tensor.shape
AttributeError: 'NoneType' object has no attribute 'shape'

It's not clear to me how to debug this output error.

Neither of these issues is a show stopper for me, and the saved model export is working beautifully 👌

thekevinscott commented 1 year ago

Might also be worth comparing the output images. I didn't do a PSNR or SSIM comparison, but visually they look identical:

onnx_tf

onnx2tf

PINTO0309 commented 1 year ago

Very happy to hear the good news. :)

I am currently working on a pull request that will significantly change the behavior of the tool to address the issues you have presented. It will be difficult and quite time consuming.

https://github.com/PINTO0309/onnx2tf/pull/184

I would appreciate it if you could close this issue once the modification to make the parameter file unnecessary would take a lot of time and is a topic of functional improvement that is very different from the initial issue of this issue.

It was a very meaningful issue. Thank you.

thekevinscott commented 1 year ago

Fantastic, I am looking forward to it! 😄

PINTO0309 / onnx2tf