onnx / onnx-tensorflow

Tensorflow Backend for ONNX

KeyError with PyTorch either Pad or GlobalAvgPooling #1050

Open bywbilly opened 1 year ago

bywbilly commented 1 year ago

Hi,

I ran into an issue converting PyTorch -> ONNX -> TF.

During the prepare step, I get the following error message:

Traceback (most recent call last):
  File "convert_to_tf.py", line 35, in <module>
    tf_rep.export_graph("tf.pb")
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/backend_rep.py", line 144, in export_graph
    self.signatures))
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1264, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1244, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 786, in _initialize
    *args, **kwds))
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2983, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3292, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3140, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1161, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3831, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.pyct.error_utils.KeyError: in user code:

    File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/backend_tf_module.py", line 99, in __call__
        output_ops = self.backend._onnx_node_to_tensorflow_op(onnx_node,
    File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/backend.py", line 347, in _onnx_node_to_tensorflow_op
        return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
    File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/handlers/handler.py", line 59, in handle  *
        return ver_handle(node, **kwargs)
    File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/handlers/backend/pad.py", line 98, in version_13
        return cls._common(node, **kwargs)
    File "/home/fs01/yb263/miniconda3/envs/bywbilly/lib/python3.7/site-packages/onnx_tf/handlers/backend/pad.py", line 76, in _common
        constant_values = tensor_dict[node.inputs[2]] if len(

    KeyError: ''

After some searching, I found that the Pad operator may be expected to have 4 parameters, but it now has only 2 in ONNX (from issue https://github.com/onnx/onnx-tensorflow/issues/21). How can I fix this? Any suggestions are welcome.
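One possible reading of the `KeyError: ''` above (an assumption, not something confirmed in this thread): the ONNX IR allows an omitted optional input to be written as an empty-string name, so a `Pad` node can list three inputs where the third is `''`. A lookup that checks only the number of inputs would then index its tensor dictionary with `''`. The sketch below reproduces that situation with plain Python dicts. `tensor_dict`, `node_inputs`, and both helper functions are hypothetical stand-ins for illustration, not the onnx-tf code.

```python
def lookup_by_count(tensor_dict, node_inputs, default=0):
    """Hypothetical lookup that trusts the input count alone."""
    return tensor_dict[node_inputs[2]] if len(node_inputs) == 3 else default


def lookup_guarded(tensor_dict, node_inputs, default=0):
    """Hypothetical lookup that also treats '' as 'input omitted'."""
    if len(node_inputs) >= 3 and node_inputs[2] != "":
        return tensor_dict[node_inputs[2]]
    return default


# Stand-in for a Pad node whose optional third input is omitted via ''.
tensor_dict = {"x": [[1.0, 2.0]], "pads": [0, 1, 0, 1]}
node_inputs = ["x", "pads", ""]

try:
    lookup_by_count(tensor_dict, node_inputs)
    missing_key = None
except KeyError as err:
    # The count-only lookup fails on the empty-string name.
    missing_key = err.args[0]

# The guarded lookup falls back to the default instead of raising.
guarded_value = lookup_guarded(tensor_dict, node_inputs)
print(repr(missing_key), guarded_value)
```

If this reading is right, listing the inputs of each `Pad` node in the exported ONNX file should show an empty third name.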

My PyTorch model is a simple UNet with a global average pooling layer at the end.

Here are the two places that might cause the error:

self.global_pool = nn.AdaptiveAvgPool2d((1, 1))

or

diffY = x2.size()[2] - x1.size()[2]
diffX = x2.size()[3] - x1.size()[3]

x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                diffY // 2, diffY - diffY // 2])
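To make the padding arithmetic in that snippet concrete, here is a small pure-Python sketch. `center_pad_amounts` is a hypothetical helper name used only for illustration; it returns the four values in the order `F.pad` expects for the last two dimensions of a 4-D tensor (left, right, top, bottom, i.e. last dimension first).

```python
def center_pad_amounts(diff_x, diff_y):
    """Split each size difference into (before, after) pad amounts.

    The order follows F.pad for the last two dimensions:
    [left, right, top, bottom].
    """
    return [diff_x // 2, diff_x - diff_x // 2,
            diff_y // 2, diff_y - diff_y // 2]


# Example: x2 is 3 columns wider and 1 row taller than x1.
pads = center_pad_amounts(diff_x=3, diff_y=1)
print(pads)  # [1, 2, 0, 1]
```

When the two tensors already have the same spatial size, every pad amount is zero and the call leaves `x1` unchanged.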

TF version: 2.8.0
onnx-tf version: 1.10.0
onnx version: 1.12.0
Python version: 3.7.10

Thanks!

PINTO0309 commented 1 year ago

As stated in the README of this repository, it seems to have been deprecated.

So I have started to create and test another tool by myself. If you don't mind, could you share the ONNX file with me? I would like to test your ONNX file.

Here is the tool I am creating. https://github.com/PINTO0309/onnx2tf

bywbilly commented 1 year ago

@PINTO0309 Thanks for your reply. It would be great if you could try my ONNX file. Here it is: https://drive.google.com/file/d/1WmzBLW2HExUmEq-auEZ1bc1YAPooOUgo/view?usp=share_link

PINTO0309 commented 1 year ago

Nothing went wrong and the conversion appears to be correct. I am only verifying that the conversion completes successfully and have not yet verified whether the accuracy is degraded.

$ python -c "import tensorflow as tf; print(tf.__version__)"
2.10.0
$ python -V
Python 3.8.10

$ pip install -U onnx \
&& pip install -U nvidia-pyindex \
&& pip install -U onnx-graphsurgeon \
&& pip install -U onnxsim \
&& pip install -U simple_onnx_processing_tools \
&& pip install -U onnx2tf

$ onnx2tf -V
1.1.46

or

$ docker run --rm -it \
-v `pwd`:/workdir \
-w /workdir \
ghcr.io/pinto0309/onnx2tf:1.1.46

$ onnx2tf -i tf.onnx

Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                   ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Cast              │ 31             │ 0                │
│ Concat            │ 14             │ 4                │
│ Constant          │ 78             │ 0                │
│ ConstantOfShape   │ 4              │ 0                │
│ Conv              │ 18             │ 18               │
│ ConvTranspose     │ 4              │ 4                │
│ Div               │ 12             │ 0                │
│ Equal             │ 1              │ 0                │
│ Gather            │ 10             │ 0                │
│ Gemm              │ 1              │ 1                │
│ GlobalAveragePool │ 1              │ 1                │
│ Identity          │ 2              │ 0                │
│ If                │ 1              │ 0                │
│ MaxPool           │ 4              │ 4                │
│ Pad               │ 4              │ 0                │
│ Relu              │ 18             │ 18               │
│ Reshape           │ 8              │ 0                │
│ Shape             │ 10             │ 0                │
│ Slice             │ 4              │ 0                │
│ Squeeze           │ 1              │ 1                │
│ Sub               │ 15             │ 0                │
│ Transpose         │ 4              │ 0                │
│ Unsqueeze         │ 24             │ 0                │
│ Model Size        │ 118.6MiB       │ 118.6MiB         │
└───────────────────┴────────────────┴──────────────────┘

Model optimizing complete!

Automatic generation of each OP name started ========================================
Automatic generation of each OP name complete!

Model loaded ========================================================================

Model convertion started ============================================================
INFO: input_op_name: input.1 shape: [1, 92, 64, 64] dtype: float32

INFO: onnx_op_type: Conv onnx_op_name: Conv_2
INFO:  input_name.1: input.1 shape: [1, 92, 64, 64] dtype: float32
INFO:  input_name.2: onnx::Conv_453 shape: [64, 92, 3, 3] dtype: <class 'numpy.float32'>
INFO:  input_name.3: onnx::Conv_454 shape: [64] dtype: <class 'numpy.float32'>
INFO:  output_name.1: input.4 shape: [1, 64, 64, 64] dtype: float32
INFO: tf_op_type: convolution_v2
INFO:  input.1.input: name: input.1 shape: (1, 64, 64, 92) dtype: <dtype: 'float32'> 
INFO:  input.2.weights: shape: (3, 3, 92, 64) dtype: float32 
INFO:  input.3.bias: shape: (64,) dtype: float32 
INFO:  input.4.strides: val: [1, 1] 
INFO:  input.5.dilations: val: [1, 1] 
INFO:  input.6.padding: val: SAME 
INFO:  input.7.group: val: 1 
INFO:  output.1.output: name: tf.math.add/Add:0 shape: (1, 64, 64, 64) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Relu onnx_op_name: Relu_3
INFO:  input_name.1: input.4 shape: [1, 64, 64, 64] dtype: float32
INFO:  output_name.1: onnx::Conv_123 shape: [1, 64, 64, 64] dtype: float32
INFO: tf_op_type: relu
INFO:  input.1.features: name: tf.math.add/Add:0 shape: (1, 64, 64, 64) dtype: <dtype: 'float32'> 
INFO:  output.1.output: name: tf.nn.relu/Relu:0 shape: (1, 64, 64, 64) dtype: <dtype: 'float32'> 
:
saved_model output started ==========================================================
saved_model output complete!
Estimated count of arithmetic ops: 6.443 G  ops, equivalently 3.221 G  MACs
Float32 tflite output complete!
Estimated count of arithmetic ops: 6.443 G  ops, equivalently 3.221 G  MACs
Float16 tflite output complete!
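
The Conv entry in the log above shows the layout change made during conversion: the ONNX input `[1, 92, 64, 64]` (NCHW) becomes `(1, 64, 64, 92)` (NHWC), and the weights `[64, 92, 3, 3]` become `(3, 3, 92, 64)`. A minimal sketch of those two axis permutations on shape tuples follows. `permute_shape` and the two constant names are hypothetical, chosen for illustration, and are not part of onnx2tf.

```python
def permute_shape(shape, perm):
    """Reorder the axes of a shape according to perm."""
    return tuple(shape[axis] for axis in perm)


# Activations: (batch, channels, height, width) -> (batch, height, width, channels)
NCHW_TO_NHWC = (0, 2, 3, 1)
# Conv weights: (out, in, kh, kw) -> (kh, kw, in, out)
OIHW_TO_HWIO = (2, 3, 1, 0)

activation_shape = permute_shape([1, 92, 64, 64], NCHW_TO_NHWC)
weight_shape = permute_shape([64, 92, 3, 3], OIHW_TO_HWIO)

print(activation_shape)  # (1, 64, 64, 92)
print(weight_shape)      # (3, 3, 92, 64)
```

Both results match the `input.1.input` and `input.2.weights` shapes reported for `Conv_2`.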
