[Bug] optimizing a model based on CoatNet

miccio-dk commented 2 years ago

System information (version)

OpenVINO=> 2022.1.0
Operating System / Platform => Ubuntu 22.04.1 LTS
Compiler => N/A
Problem classification => Model optimization
Framework: PyTorch/ONNX
Model name: CoAtNet

Detailed description

When trying to optimize a model based on CoAtNet (full model here), I get the following error:

### OPENVINO MODEL OPTIMIZER COMMAND: 
/home/rmiccini/miniconda3/envs/speakerid_training/bin/mo --input_model /home/rmiccini/speakerid_models/onnx/2022-08-11_07-03-57_epoch_50.onnx --model_name 2022-08-11_07-03-57_epoch_50 --data_type FP16 --output_dir /home/rmiccini/speakerid_models/optimized --input "spectrograms[1 80 2048]" --output "embeddings"

### OPENVINO MODEL OPTIMIZER ERRORS: 
[ ERROR ]  -------------------------------------------------
[ ERROR ]  ----------------- INTERNAL ERROR ----------------
[ ERROR ]  Unexpected exception happened.
[ ERROR ]  Please contact Model Optimizer developers and forward the following information:
[ ERROR ]  While validating ONNX node '<Node(Add): Add_416>':
Check 'PartialShape::broadcast_merge_into(pshape, node->get_input_partial_shape(i), autob)' failed at core/src/op/util/elementwise_args.cpp:27:
While validating node 'v1::Add Add_4031 (onnx::Add_652[0]:f32{1,8,816,816}, onnx::Add_676[0]:f32{1,8,156,156}) -> (dynamic...)' with friendly_name 'Add_4031':
Argument shapes are inconsistent.

[ ERROR ]  Traceback (most recent call last):
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 533, in main
    ret_code = driver(argv)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 489, in driver
    graph, ngraph_function = prepare_ir(argv)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 394, in prepare_ir
    ngraph_function = moc_pipeline(argv, moc_front_end)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/moc_frontend/pipeline.py", line 147, in moc_pipeline
    ngraph_function = moc_front_end.convert(input_model)
RuntimeError: While validating ONNX node '<Node(Add): Add_416>':
Check 'PartialShape::broadcast_merge_into(pshape, node->get_input_partial_shape(i), autob)' failed at core/src/op/util/elementwise_args.cpp:27:
While validating node 'v1::Add Add_4031 (onnx::Add_652[0]:f32{1,8,816,816}, onnx::Add_676[0]:f32{1,8,156,156}) -> (dynamic...)' with friendly_name 'Add_4031':
Argument shapes are inconsistent.

[ ERROR ]  ---------------- END OF BUG REPORT --------------
[ ERROR ]  -------------------------------------------------
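For readers following the log: the failing check is a broadcast-compatibility test on the two inputs of the Add node. The sketch below is a plain-Python restatement of NumPy-style broadcasting for static shapes only (the helper name `broadcast_shape` is hypothetical, not an OpenVINO API). It shows why the two shapes reported for Add_4031 cannot be merged.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the NumPy-style broadcast of two static shapes, or raise ValueError."""
    result = []
    # Compare dimensions from the trailing axis; missing leading axes count as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"cannot broadcast {tuple(a)} with {tuple(b)}: {x} vs {y}")
    return tuple(reversed(result))


# Shapes reported for the two inputs of Add_4031 in the log above.
lhs = (1, 8, 816, 816)
rhs = (1, 8, 156, 156)

try:
    broadcast_shape(lhs, rhs)
except ValueError as err:
    print("Add would fail:", err)
```

Neither 816 nor 156 is 1, so the last two axes cannot be broadcast against each other. That matches the "Argument shapes are inconsistent" message.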

Note that the forward pass of the Encoder model wrapper performs some custom padding to adjust the input shape to the size expected by the CoAtNet backbone. When I explicitly specify an input shape that matches the backbone and doesn't require any padding, I get a different error:

/home/rmiccini/miniconda3/envs/speakerid_training/bin/mo --input_model /home/rmiccini/speakerid_models/onnx/2022-08-11_07-03-57_epoch_50.onnx --model_name 2022-08-11_07-03-57_epoch_50 --data_type FP16 --output_dir /home/rmiccini/speakerid_models/optimized --input "spectrograms[1 96 416]" --output "embeddings"

### OPENVINO MODEL OPTIMIZER ERRORS: 
[ ERROR ]  -------------------------------------------------
[ ERROR ]  ----------------- INTERNAL ERROR ----------------
[ ERROR ]  Unexpected exception happened.
[ ERROR ]  Please contact Model Optimizer developers and forward the following information:
[ ERROR ]  Check '(scale_pshape.is_dynamic() || (scale_pshape.rank().is_static() && scale_pshape.rank().get_length() == 1 && data_pshape[1].compatible(scale_pshape[0])))' failed at frontends/onnx/frontend/src/op/instance_norm.cpp:55:
While validating ONNX node '<Node(InstanceNormalization): InstanceNormalization_2>':
Scale input must be one dimensional vector of number of input data channels size.

[ ERROR ]  Traceback (most recent call last):
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 533, in main
    ret_code = driver(argv)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 489, in driver
    graph, ngraph_function = prepare_ir(argv)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/main.py", line 394, in prepare_ir
    ngraph_function = moc_pipeline(argv, moc_front_end)
  File "/home/rmiccini/miniconda3/envs/speakerid_training/lib/python3.9/site-packages/openvino/tools/mo/moc_frontend/pipeline.py", line 147, in moc_pipeline
    ngraph_function = moc_front_end.convert(input_model)
RuntimeError: Check '(scale_pshape.is_dynamic() || (scale_pshape.rank().is_static() && scale_pshape.rank().get_length() == 1 && data_pshape[1].compatible(scale_pshape[0])))' failed at frontends/onnx/frontend/src/op/instance_norm.cpp:55:
While validating ONNX node '<Node(InstanceNormalization): InstanceNormalization_2>':
Scale input must be one dimensional vector of number of input data channels size.

[ ERROR ]  ---------------- END OF BUG REPORT --------------
[ ERROR ]  -------------------------------------------------
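The condition quoted from instance_norm.cpp can be restated for fully static shapes as a small sketch: the scale tensor must be one-dimensional and its length must equal the channel dimension (axis 1) of the data. The function name `scale_matches_channels` is hypothetical. The scale length of 80 is an assumption based on `n_mels=80` in the reproduction arguments, not something the log states.

```python
def scale_matches_channels(data_shape, scale_shape):
    """Static-shape version of the InstanceNormalization scale check."""
    # The scale input must be a 1-D vector ...
    if len(scale_shape) != 1:
        return False
    # ... whose length equals the channel dimension (axis 1) of the data.
    return data_shape[1] == scale_shape[0]


# Assumption: the exported scale vector has 80 entries (n_mels=80).
print(scale_matches_channels((1, 80, 2048), (80,)))  # True
print(scale_matches_channels((1, 96, 416), (80,)))   # False
```

Under that assumption, overriding the input to [1 96 416] puts 96 on axis 1 while the scale keeps 80 entries, which would trip this check.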

Steps to reproduce

1. Instantiate the Encoder model
2. Export to ONNX
3. Compile using the CLI arguments shown above (in the code blocks)

For Encoder instantiation, use the following arguments:

import torch

# Encoder is the model wrapper defined in the linked script
model = Encoder(
    n_mels=80,
    max_length=400,
    num_blocks=[2, 2, 2, 2, 2],
    channels=[48, 64, 96, 128, 192],
    block_types=['C', 'C', 'T', 'T'])

# sample input data of shape [batch_size, features, timeframes]
# may also try: (16, 96, 416)
input_data = torch.randn(16, 80, 300)

Issue submission checklist

[x] I report the issue, it's not a question
[x] I checked the problem with documentation, FAQ, open issues, Stack Overflow, etc and have not found solution
[x] There is reproducer code and related data files: images, videos, models, etc.

Iffa-Intel commented 2 years ago

@miccio-dk I tried to run your Python script, but it didn't generate your model file. It would be great if you could share your ONNX model so we can validate it on our end.

miccio-dk commented 2 years ago

Hi Iffa, thanks for looking into this. I just edited the script to include the model instantiation and export; please try running the updated version (same link). I am also attaching the ONNX model for your convenience: enc.zip

Iffa-Intel commented 2 years ago

I checked your model. Benchmarking the native ONNX model works, and the model appears to be supported by OpenVINO.

I managed to convert the ONNX model into an FP16 precision IR (without changing the shape).

Inference with the FP16 IR (without changing the shape) also works.

However, conversion with an explicitly specified static or dynamic shape failed. The expected input layout is NCHW, so this is what I tried:

Dynamic conversion failed

Static conversion failed

I saw that you were trying to pass 3 shape values, so I tried that as well, but it also failed.

We'll investigate this further and get back to you with a possible workaround for static/dynamic IR conversion.

avitial commented 2 years ago

> Hi Iffa, thanks for looking into this. I just edited the script to include the model instantiation and export, please try and run the updated version (same link). I am also attaching the ONNX model for your convenience: enc.zip

@miccio-dk I've taken a look at the provided ONNX model file. The model has a single input (named input.1, shape [16,80,300]) and a single output (named 2026, shape [16,576,13]). I have no problems optimizing the model with Model Optimizer (using OpenVINO 2022.1 on a Windows host). The following commands all work correctly; note that the --input_shape value in my last command comes from your generated ONNX model (enc.onnx). Do you observe a similar error when optimizing enc.onnx with these commands?

$ mo --input_model enc.onnx --data_type FP16 --input input.1 --output 2026
$ mo --input_model enc.onnx --data_type FP16 --input input.1 --output 2026
$ mo --input_model enc.onnx --data_type FP16 --input_shape [16,80,300]

[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: C:\Users\user\Documents\12543-optimizing_a_model_based_on_CoatNet\enc\enc.xml
[ SUCCESS ] BIN file: C:\Users\user\Documents\12543-optimizing_a_model_based_on_CoatNet\enc\enc.bin
[ SUCCESS ] Total execution time: 5.83 seconds.

If I use the input shape from your initial description (--input "spectrograms[1 80 2048]") with the same ONNX model, I get a RuntimeError. Please check the expected input shape as defined in your model and make sure the shape you pass to Model Optimizer matches it. Hope this helps!

$ mo --input_model enc.onnx --data_type FP16 --input_shape [1,80,2048]

RuntimeError: While validating ONNX node '<Node(Add): Add_416>':
Check 'PartialShape::broadcast_merge_into(pshape, node->get_input_partial_shape(i), autob)' failed at C:\j\workspace\private-ci\ie\build-windows-vs2019@3\b\repos\openvino\src\core\src\op\util\elementwise_args.cpp:30:
While validating node 'v1::Add Add_4015 (onnx::Add_636[0]:f32{1,8,816,816}, onnx::Add_660[0]:f32{1,8,156,156}) -> (dynamic...)' with friendly_name 'Add_4015':
Argument shapes are inconsistent.
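One way to check the input names and shapes stored in an ONNX file before passing them to Model Optimizer is to read them with the `onnx` Python package. This is a minimal sketch, assuming `onnx` is installed; the helper names `format_dims` and `list_onnx_inputs` are hypothetical.

```python
def format_dims(dims):
    """Render ONNX dimension entries: symbolic names as str, fixed sizes as int, unknown as '?'."""
    out = []
    for d in dims:
        if getattr(d, "dim_param", ""):
            out.append(d.dim_param)       # symbolic (dynamic) axis
        elif getattr(d, "dim_value", 0) > 0:
            out.append(d.dim_value)       # fixed size
        else:
            out.append("?")               # no size recorded
    return out


def list_onnx_inputs(path):
    """Return {input_name: shape} for the real graph inputs of an ONNX file."""
    import onnx  # imported lazily so format_dims works without onnx installed

    graph = onnx.load(path).graph
    # Some exports list weights among the graph inputs; skip those.
    initializers = {init.name for init in graph.initializer}
    return {
        inp.name: format_dims(inp.type.tensor_type.shape.dim)
        for inp in graph.input
        if inp.name not in initializers
    }


# Usage (not executed here), with the file name from the commands above:
#   for name, shape in list_onnx_inputs("enc.onnx").items():
#       print(name, shape)
```

For the attached enc.onnx this should report the single input described above (input.1 with shape [16,80,300]). A fixed size on every axis suggests the graph was exported with static shapes.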

avitial commented 2 years ago

Closing this, I hope previous responses were sufficient to help you proceed. Feel free to reopen and provide additional information or ask any questions related to this topic.

miccio-dk commented 2 years ago

Sorry for the late reply. With the help provided above, I did manage to export and optimize my model using OpenVINO. However, when trying to compile the model for MYRIAD X and run inference on it, I get the following:

Traceback (most recent call last):
  File "/home/rmiccini/speakerid_training_code/infer.py", line 12, in main
    infer(config)
  File "/home/rmiccini/speakerid_training_code/speakerid_training/pipelines/infer_pipeline.py", line 61, in infer
    compiled_model = core.compile_model(model, params.device)
  File "/home/rmiccini/intel/openvino_2022/python/python3.9/openvino/runtime/ie_api.py", line 266, in compile_model
    super().compile_model(model, device_name, {} if config is None else config)
RuntimeError: Failed to find reference implementation for `tensor.563` Layer with `GatherElements` Type on constant propagation
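To narrow down whether this failure is specific to one device plugin, the same model can be compiled on several devices and the outcomes compared. The sketch below is a diagnostic under stated assumptions, not a fix: `try_compile` is a hypothetical helper, and it only relies on a `compile_model(model, device_name)` call like the one in the traceback above.

```python
def try_compile(core, model, devices):
    """Compile `model` on each device; map device name to None on success or the error text."""
    results = {}
    for device in devices:
        try:
            core.compile_model(model, device)
            results[device] = None
        except RuntimeError as err:
            results[device] = str(err)
    return results


# Usage with OpenVINO installed (not executed here); "enc.xml" is the IR
# file name from the Model Optimizer output earlier in this thread:
#   from openvino.runtime import Core
#   core = Core()
#   model = core.read_model("enc.xml")
#   print(try_compile(core, model, ["CPU", "MYRIAD"]))
```

If CPU compiles and only MYRIAD reports the GatherElements error, that would point at the MYRIAD plugin rather than at the converted IR.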
