PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
MIT License
662 stars, 65 forks

[ONNX to TFLite] Cannot reshape from slice to unsqueeze. #637

Closed: saraphinesER closed this issue 3 months ago

saraphinesER commented 3 months ago

Issue Type

Others

OS

Windows

onnx2tf version number

1.20.0

onnx version number

1.16.0

onnxruntime version number

1.17.1

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.13.0

Download URL for ONNX

https://drive.google.com/file/d/1RQ-hO8f9UIGvrSLVv2SCvCFXVSA1zJ2H/view?usp=drive_link

Parameter Replacement JSON

{ }

Description

  1. Purpose: personal study and development.
  2. What: the ONNX model passes onnx.checker.check_model, and I then try the conversion:

         import onnx
         import onnx2tf

         model = onnx.load(ONNXmodel_PATH)
         # check_model returns None and raises on failure
         chkResults = onnx.checker.check_model(model)
         onnx2tf.convert(
             input_onnx_file_path=ONNXmodel_PATH,
             output_folder_path=ConvertH5model_PATHROOT,
             copy_onnx_input_output_names_to_tflite=True,
             param_replacement_file=PROJ_PATH + r'\outputs\replace.json',
         )

ERROR message

INFO: 202 / 330
INFO: onnx_op_type: Conv onnx_op_name: /Conv
INFO:  input_name.1: /Reshape_1_output_0 shape: [1, 1, 1109] dtype: float32
INFO:  input_name.2: /Cast_3_output_0 shape: [1, 1, 112] dtype: float32
INFO:  output_name.1: /Conv_output_0 shape: [1, 1, 1110] dtype: float32
INFO: tf_op_type: convolution_v2
INFO:  input.1.input: name: tf.compat.v1.pad_1/Pad:0 shape: (1, 1221, 1) dtype: <dtype: 'float32'> 
INFO:  input.2.weights: shape: (1, 1, 112) dtype: <dtype: 'float32'> 
INFO:  input.3.bias: 
INFO:  input.4.strides: val: [1] 
INFO:  input.5.dilations: val: [1] 
INFO:  input.6.padding: val: VALID 
INFO:  input.7.group: val: 1 
INFO:  output.1.output: name: tf.compat.v1.nn.conv1d/conv1d/Squeeze:0 shape: (1, 1221, 112) dtype: <dtype: 'float32'> 

INFO: 203 / 330
INFO: onnx_op_type: Slice onnx_op_name: /Slice
INFO:  input_name.1: /Conv_output_0 shape: [1, 1, 1110] dtype: float32
INFO:  input_name.2: /Constant_12_output_0 shape: [1] dtype: int64
INFO:  input_name.3: /Constant_13_output_0 shape: [1] dtype: int64
INFO:  input_name.4: /Constant_11_output_0 shape: [1] dtype: int64
INFO:  input_name.5: /Constant_14_output_0 shape: [1] dtype: int64
INFO:  output_name.1: /Slice_output_0 shape: [1, 1, 1109] dtype: float32
INFO: tf_op_type: strided_slice
INFO:  input.1.input_: name: tf.compat.v1.nn.conv1d/conv1d/Squeeze:0 shape: (1, 1221, 112) dtype: <dtype: 'float32'> 
INFO:  input.2.begin: shape: (1,) dtype: <dtype: 'int64'> 
INFO:  input.3.end: shape: (1,) dtype: <dtype: 'int64'> 
INFO:  input.4.strides: shape: (1,) dtype: <dtype: 'int64'> 
INFO:  output.1.output: name: tf.strided_slice/StridedSlice:0 shape: (1, 1221, 111) dtype: <dtype: 'float32'> 

INFO: 204 / 330
INFO: onnx_op_type: Reshape onnx_op_name: /Reshape_2
INFO:  input_name.1: /Slice_output_0 shape: [1, 1, 1109] dtype: float32
INFO:  input_name.2: /Constant_15_output_0 shape: [3] dtype: int64
INFO:  output_name.1: /Reshape_2_output_0 shape: [1, 1, 1109] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\onnx2tf\utils\common_functions.py", line 310, in print_wrapper_func
    result = func(*args, **kwargs)
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\onnx2tf\utils\common_functions.py", line 383, in inverted_operation_enable_disable_wrapper_func
    result = func(*args, **kwargs)
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\onnx2tf\utils\common_functions.py", line 53, in get_replacement_parameter_wrapper_func
    func(*args, **kwargs)
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\onnx2tf\ops\Reshape.py", line 254, in make_node
    tf.reshape(
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\tensorflow\python\ops\weak_tensor_ops.py", line 88, in wrapper
    return op(*args, **kwargs)
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\keras\src\layers\core\tf_op_layer.py", line 119, in handle
    return TFOpLambda(op)(*args, **kwargs)
  File "C:\ProgramData\miniconda3\envs\py39_tf\lib\site-packages\keras\src\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.reshape_2" (type TFOpLambda).

Cannot reshape a tensor with 135531 elements to shape [1,1,1109] (1109 elements) for '{{node tf.reshape_2/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32](Placeholder, tf.reshape_2/Reshape/shape)' with input shapes: [1,111,1221], [3] and with input tensors computed as partial shapes: input[1] = [1,1,1109].

Call arguments received by layer "tf.reshape_2" (type TFOpLambda):
  • tensor=tf.Tensor(shape=(1, 111, 1221), dtype=float32)
  • shape=['1', '1', '1109']
  • name='/Reshape_2'
  3. How: I tried to apply the example from https://github.com/PINTO0309/onnx2tf/issues/15. The case looks similar, but it did not solve my problem. I also tried changing the dimensions of the "Slice" op, which did not help. My implementation may be incorrect.
  4. Why: I want to learn how to convert a PyTorch model -> ONNX -> TFLite. I assumed the ONNX export was correct because onnx.checker.check_model passed.
  5. Resources: I used the training script from https://github.com/facebookresearch/denoiser and converted the result to ONNX. Hugging Face hosts a TFLite version of this model that does not appear to have been rewritten in TensorFlow, so I assumed it was produced through ONNX conversion.

Originally I tried onnx-tensorflow (onnx_tf.backend) and it failed. Then I found this project, and its approach seems more flexible and compatible, so I would really like to learn how to use it correctly, since I have several PyTorch models to deploy as TFLite. Thanks for your patience. I look forward to your guidance on converting ONNX to TFLite correctly.
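
The ValueError in the log above is an element-count mismatch, and the numbers can be checked by hand. The sketch below uses only shapes copied from the log. The helper names `can_reshape` and `conv1d_out_len` are hypothetical, and the reading that the (1, 1, 112) weight was treated as a width-1 kernel with 112 output channels is an assumption consistent with the logged shapes, not something the log states.

```python
from math import prod

# Shapes copied from the conversion log.
onnx_slice_output = (1, 1, 1109)    # what ONNX expects to feed /Reshape_2
tf_reshape_input = (1, 111, 1221)   # what the converted TF graph produced
reshape_target = (1, 1, 1109)       # target shape passed to tf.reshape


def can_reshape(src, dst):
    """A reshape is only valid when both shapes hold the same number of elements."""
    return prod(src) == prod(dst)


def conv1d_out_len(length, kernel, pad_total=0, stride=1, dilation=1):
    """Standard output-length formula for a 1-D convolution."""
    return (length + pad_total - dilation * (kernel - 1) - 1) // stride + 1


print(prod(tf_reshape_input))                          # 135531
print(prod(reshape_target))                            # 1109
print(can_reshape(onnx_slice_output, reshape_target))  # True
print(can_reshape(tf_reshape_input, reshape_target))   # False

# ONNX side: length 1109, kernel 112, total padding 1221 - 1109 = 112.
print(conv1d_out_len(1109, 112, pad_total=112))        # 1110, matches /Conv_output_0
# TF side (assumed reading): padded length 1221 with a width-1 kernel keeps 1221.
print(conv1d_out_len(1221, 1))                         # 1221, matches the logged TF output
```

Under that reading, the Slice then trims the 112-wide axis to 111 instead of trimming 1110 to 1109, which yields the 1 x 111 x 1221 tensor that cannot be reshaped to [1, 1, 1109].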

PINTO0309 commented 3 months ago

The structure of the ONNX file is already broken before onnx2tf converts it. Models that cannot run inference with onnxruntime should not be used.

sit4onnx -if best_bkuped.onnx -oep cpu

Traceback (most recent call last):
  File "/home/xxxx/.local/bin/sit4onnx", line 8, in <module>
    sys.exit(main())
  File "/home/xxxx/.local/lib/python3.10/site-packages/sit4onnx/onnx_inference_test.py", line 506, in main
    final_results = inference(
  File "/home/xxxx/.local/lib/python3.10/site-packages/sit4onnx/onnx_inference_test.py", line 241, in inference
    onnx_session = onnxruntime.InferenceSession(
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from best_bkuped.onnx failed:This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (/Mul_3_output_0) of operator (Equal) in node (/Equal) is invalid.

onnxsim best_bkuped.onnx best_bkuped.onnx

Your model contains "Tile" ops or/and "ConstantOfShape" ops. Folding these ops can make the simplified model much larger. If it is not expected, please specify "--no-large-tensor" (which will lose some optimization chances)
Simplifying...
Traceback (most recent call last):
  File "/home/xxxx/.local/bin/onnxsim", line 8, in <module>
    sys.exit(main())
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxsim/onnx_simplifier.py", line 481, in main
    model_opt, check_ok = simplify(
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxsim/onnx_simplifier.py", line 199, in simplify
    model_opt_bytes = C.simplify(
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxsim/onnx_simplifier.py", line 252, in Run
    sess = rt.InferenceSession(
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/xxxx/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 474, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (/Mul_3_output_0) of operator (Equal) in node (/Equal) is invalid.
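
As the two tracebacks above show, a model can be rejected by onnxruntime even though the reporter's onnx.checker.check_model call passed. A minimal sketch of a pre-conversion gate follows, assuming onnxruntime is installed. The function name `check_with_onnxruntime` is hypothetical, and `best_bkuped.onnx` is simply the file name used in the commands above.

```python
def check_with_onnxruntime(model_path):
    """Return (ok, message): whether onnxruntime can build a session for the model."""
    try:
        import onnxruntime as ort
    except ImportError as exc:
        return False, f"onnxruntime is not installed: {exc}"
    try:
        # Building the session is where INVALID_GRAPH errors surface.
        ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "onnxruntime loaded the model"


if __name__ == "__main__":
    ok, message = check_with_onnxruntime("best_bkuped.onnx")
    print(ok, message)
```

Running a check like this before calling onnx2tf separates export problems from conversion problems: if the session cannot be created, the fix belongs in the PyTorch export, not in the converter.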

saraphinesER commented 3 months ago

Thanks for the analysis and reply. Does this mean I did something wrong when exporting the ONNX model from PyTorch? It is frustrating that I apparently have to dig further into ONNX details to find what causes the issue. If it's not too much trouble, please give me a hint or some guidance if you have seen this kind of error before. I still appreciate your comment, and I will keep trying to find the root cause of this INVALID_GRAPH error.

PINTO0309 commented 3 months ago

Remove from PyTorch's logic the preprocessing that prevents shape estimation at these two locations. No meaningless and redundant processes should be left behind.

image

image

saraphinesER commented 3 months ago

Hi, sorry for the late response. I saw your guidance right after you posted it, but as a newbie I did not fully understand how to act on the hint. I searched around but still do not know which direction to try regarding the "preprocessing that prevents shape estimation" you mentioned. Currently I can only share the PyTorch model and JSON file; I am not sure if this helps. If it's not too much trouble, please give me some hints or even keywords that could help me find examples, so I can learn more about debugging this or which direction to try by trial and error. Thanks again for your kind help with the analysis. Attachments: pytorch model, history-json

PINTO0309 commented 3 months ago

If you share only the .pth with me, I can't advise you on anything, because the problem is the design of the model you used for your training. The ConstantOfShape [112] part should be a constant. However, depending on how the PyTorch logic is written, it may not generate ONNX correctly.

For example, statements such as x.shape[0] and x.shape[1] cause major problems. You might want to reconsider whether you really need the NonZero, Sin, Div, Where ops. NonZero is the most cancerous.

I assure you. Models with NonZero are garbage.

Frankly, it's quite a hassle to have to imagine everything and deal with you. I am not a free PyTorch advisor.
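
The remark that the ConstantOfShape [112] part should be a constant can be illustrated without any framework. This is a minimal sketch, assuming the 112-element tensor is a filter kernel built from Sin and Div inside the forward pass. The thread does not show the model code, so that is an assumption, and `sinc_kernel` and `KERNEL_112` are hypothetical names. Computing the values once in plain Python, outside the traced code, leaves nothing shape-dependent for the exporter to trace.

```python
import math


def sinc_kernel(taps):
    """Hypothetical example kernel: sin(t)/t sampled at `taps` half-integer points."""
    half = taps / 2
    values = []
    for i in range(taps):
        t = math.pi * (i - half + 0.5)
        # t is never exactly 0 at half-integer positions,
        # so no "t == 0" branch (Equal/Where) is needed.
        values.append(math.sin(t) / t)
    return values


# Computed once at import time: a fixed list of 112 floats.
KERNEL_112 = sinc_kernel(112)

print(len(KERNEL_112))  # 112
```

In PyTorch, such a list could be wrapped with torch.tensor(...) and stored through nn.Module.register_buffer so that the forward pass only reads it. Whether that removes the NonZero and Where nodes in this particular model is untested here.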

saraphinesER commented 3 months ago

Hi PINTO0309, certainly you don't have to do so much. I always appreciate people who are willing to share and guide. There is just a small misunderstanding here: I didn't mean to confuse you by sharing only a .pth model. To explain a little more, this is a PyTorch model from Facebook Research from 2020, and the training script is https://github.com/facebookresearch/denoiser/blob/main/train.py, which I listed in item 5 when raising the question. I chose this model for study and practice because Facebook's work seemed reasonably representative. This reply is only a clarification; you don't have to respond, since I have already bothered you enough. I will look into the hints you gave above. Thanks again for your kind guidance over the past couple of days.