microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Getting error during inference with onnx built with openvino support #7317

Open gpradhanm opened 3 years ago

gpradhanm commented 3 years ago

Describe the bug: I have an ONNX model that works fine with ONNX Runtime. When I try to run inference with ONNX Runtime built with OpenVINO support, I get the error below.

2021-04-12 14:06:54.345055587 [E:onnxruntime:, sequential_executor.cc:338 Execute] Non-zero status code returned while running OpenVINO-EP-subgraph_3 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_3_0' Status Message: /home/ubuntu/General_RnD/ONNX/onnxruntime/onnxruntime/core/providers/openvino/backend_utils.cc:113 std::shared_ptr<InferenceEngine::CNNNetwork> onnxruntime::openvino_ep::backend_utils::CreateCNNNetwork(const onnx::ModelProto&, const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::openvino_ep::SubGraphContext&, std::map<std::__cxx11::basic_string<char>, std::shared_ptr<ngraph::Node> >&) [OpenVINO-EP] [OpenVINO-EP] Exception while importing model to nGraph Func: While validating ONNX node '<Node(Unsqueeze): Unsqueeze_16>':
Check '!axes.empty()' failed at ngraph/core/src/op/unsqueeze.cpp:63:
While validating node 'v0::Unsqueeze Unsqueeze_17 (72[0]:i64{}, Constant_16[0]:i64{0}) -> (dynamic?)' with friendly_name 'Unsqueeze_17':
'axes' input is mandatory.

Traceback (most recent call last):
  File "reproducible.py", line 16, in <module>
    ort_outs_enc = ort_session_enc.run(None, {'xs_pad':xs_pad})
  File "/home/ubuntu/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_3 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_3_0' Status Message: /home/ubuntu/General_RnD/ONNX/onnxruntime/onnxruntime/core/providers/openvino/backend_utils.cc:113 std::shared_ptr<InferenceEngine::CNNNetwork> onnxruntime::openvino_ep::backend_utils::CreateCNNNetwork(const onnx::ModelProto&, const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::openvino_ep::SubGraphContext&, std::map<std::__cxx11::basic_string<char>, std::shared_ptr<ngraph::Node> >&) [OpenVINO-EP] [OpenVINO-EP] Exception while importing model to nGraph Func: While validating ONNX node '<Node(Unsqueeze): Unsqueeze_16>':
Check '!axes.empty()' failed at ngraph/core/src/op/unsqueeze.cpp:63:
While validating node 'v0::Unsqueeze Unsqueeze_17 (72[0]:i64{}, Constant_16[0]:i64{0}) -> (dynamic?)' with friendly_name 'Unsqueeze_17':
'axes' input is mandatory.

System information

To Reproduce: I have attached code with a weight file for error reproduction: https://we.tl/t-LAuSuS7dv0
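
A minimal sketch of this kind of repro script, in case the transfer link expires: the input name 'xs_pad' comes from the traceback above and the input shape (1, 1009, 83) from the snippets later in this thread, while the model file name "Encoder.onnx" is only a placeholder.

import numpy as np
import onnxruntime

# Placeholder model path; the actual model and weight file were shared via the link above.
sess = onnxruntime.InferenceSession("Encoder.onnx")
sess.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])

# Input name 'xs_pad' is taken from the traceback; the shape matches later snippets in this thread.
xs_pad = np.random.rand(1, 1009, 83).astype(np.float32)
ort_outs_enc = sess.run(None, {'xs_pad': xs_pad})
print(ort_outs_enc[0].shape)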

hariharans29 commented 3 years ago

cc: @jywu-msft @MaajidKhan

MaajidKhan commented 3 years ago

Hi @gpradhanm. I assume you are using an older version of OpenVINO with ONNX Runtime. Can you use the latest OpenVINO version 2021.3 and try re-running the Python script?

With the latest OpenVINO 2021.3, your model is successfully inferred via subgraph partitioning using the OpenVINO Execution Provider.
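
A quick way to sanity-check that the OpenVINO execution provider is actually available and selected after upgrading (a small sketch; "Encoder.onnx" is used as a placeholder model path):

import onnxruntime

# In an OpenVINO-enabled build this list should include 'OpenVINOExecutionProvider'.
print(onnxruntime.get_available_providers())

sess = onnxruntime.InferenceSession("Encoder.onnx")
sess.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])

# Providers actually enabled for this session, in priority order.
print(sess.get_providers())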

Here are sample screenshots of the output in debug build mode and in release build mode:

[Screenshots: output in release build mode and in debug build mode with OpenVINO 2021.3]
gpradhanm commented 3 years ago

@MaajidKhan Thanks for your quick response. That particular issue is resolved. But when I try to run inference with multiple models, I get the following error:

Traceback (most recent call last):
  File "RecogONNX.py", line 355, in <module>
    ort_outs_enc = ort_session_enc.run(None, {'xs_pad':xs_pad})
  File "/home/ubuntu/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'Concat_19' Status Message: /home/ubuntu/General_RnD/ONNX/onnxruntime/onnxruntime/core/providers/cpu/tensor/concat.cc:72 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForCompute(onnxruntime::OpKernelContext*, const std::vector<const onnxruntime::Tensor*>&, onnxruntime::Prepare&) const inputs_n_rank == inputs_0_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2

A sample code snippet is as follows: [screenshot of code snippet]

MaajidKhan commented 3 years ago

@gpradhanm it looks like your code snippet has small typos.

It should be: ort_session_enc = onnxruntime.InferenceSession("ONNX_Models/Encoder.onnx", options)

You are passing options1 instead.

Make sure that for any model you load using onnxruntime.InferenceSession, when you pass data to that model, the array matches the input shape of the model.

Example: say I am trying to pass a random array as data to an AlexNet model. The array should have the same shape as the model's input, which is (1, 3, 224, 224).

Example: xs_pad1 = np.array(np.random.rand(1,3,224,224), dtype=np.float32)
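
Rather than hard-coding the shape, the expected input name and shape can also be read from the session itself; a small sketch, with "alexnet.onnx" as a placeholder file name:

import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession("alexnet.onnx")
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)   # e.g. 'input', [1, 3, 224, 224]

# Build a random array matching the model's input shape
# (symbolic/dynamic dimensions are replaced with 1 here).
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
data = np.array(np.random.rand(*shape), dtype=np.float32)
outs = sess.run(None, {inp.name: data})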

Attaching here a code snippet where I load multiple models and infer them.

model1.txt

Here's the output:

[Screenshot of the output]

gpradhanm commented 3 years ago

Hi @MaajidKhan, sorry for providing the wrong snippet. In my actual script, there is no such typo. Following your code snippet, I created the following code and it is working fine:

import onnx
import onnxruntime
import numpy as np

options = onnxruntime.SessionOptions()
options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
xs_pad = np.array(np.random.rand(1, 1009, 83), dtype=np.float32)
print("xs_pad: ", xs_pad)

# compute ONNX Runtime output prediction
ort_session_enc = onnxruntime.InferenceSession("ONNX_Models/Encoder.onnx", options)
ort_session_enc.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])
ort_outs_enc = ort_session_enc.run(None, {ort_session_enc.get_inputs()[0].name: xs_pad})

sess = onnxruntime.InferenceSession("ONNX_Models/ctc_lo.onnx", options)
sess.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])

ort_outs_enc = ort_session_enc.run(None, {ort_session_enc.get_inputs()[0].name: xs_pad})
ort_outs_encq = sess.run(None, {'h': ort_outs_enc[0]})

print("\n")
print("ort_outs_enc: ")
print(ort_outs_enc)

print("\n")
print("ort_outs_encq: ")
print(ort_outs_encq)

But the following code is not working and gives the same error:

import onnx
import onnxruntime
import numpy as np

options = onnxruntime.SessionOptions()
options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
xs_pad = np.array(np.random.rand(1, 1009, 83), dtype=np.float32)
print("xs_pad: ", xs_pad)

ort_session_enc = onnxruntime.InferenceSession("ONNX_Models/Encoder.onnx", options)
ort_session_enc.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])
ort_outs_enc = ort_session_enc.run(None, {ort_session_enc.get_inputs()[0].name: xs_pad})

sess = onnxruntime.InferenceSession("ONNX_Models/ctc_lo.onnx", options)
sess.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])

ort_outs_enc = ort_session_enc.run(None, {ort_session_enc.get_inputs()[0].name: xs_pad})
ort_outs_encq = sess.run(None, {'h': ort_outs_enc[0]})

print("\n")
print("ort_outs_enc: ")
print(ort_outs_enc)

print("\n")
print("ort_outs_encq: ")
print(ort_outs_encq)

It is giving the following error: [screenshot of the error]

In my inferencing code base, I need to define all the sessions together and run them wherever they are required in the script.

Could you please help me in this case?

gpradhanm commented 3 years ago

@MaajidKhan Any update on the above query?

MaajidKhan commented 3 years ago

@gpradhanm Ideally, it should work. If you run the same code that is giving the error with the default CPU Execution Provider instead, it will work, since all the models will be fully supported and fully inferred on the default CPU.

But in your case, you are using the OpenVINO Execution Provider for ONNX Runtime and inferring the models using the OpenVINO backend on the CPU.

Case 1: if all the models you are inferring are fully supported by the OpenVINO Execution Provider, you won't see any error and you will get output for all the models from your script.

Case 2: in your case, there are a few models that are not fully supported on OpenVINO-EP, like Encoder.onnx. Remember, when a model is not fully supported on OpenVINO-EP, we infer the model using sub-graph partitioning.

The part of the graph that is supported by OpenVINO-EP is inferred using OpenVINO-EP, the remaining part is inferred using the default CPU provider, and the results are later concatenated before the final result is returned to the user.
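
One way to see how a model is split between OpenVINO-EP and the default CPU provider is to enable verbose logging when the session is created; the provider assignments then show up in the log. A small sketch, using Encoder.onnx from this thread as the example model:

import onnxruntime

# 0 = verbose; the log then includes graph partitioning / provider assignment details.
onnxruntime.set_default_logger_severity(0)

options = onnxruntime.SessionOptions()
options.log_severity_level = 0

sess = onnxruntime.InferenceSession("Encoder.onnx", options)
sess.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])
print(sess.get_providers())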

Now what is happening in your script is:

First, you are creating an InferenceSession() for all your models: onnxruntime.InferenceSession(model_name, options)

What this does is create a session for that particular model on that particular hardware and keep it ready. But when, at the end, you try to infer the models separately wherever required, and a given model is only partially supported, it looks like there is a mismatch while concatenating the different subgraphs, because multiple sessions were created for different models.

That's why you are getting this error.

Note (just additional information): for the Encoder.onnx model, currently 6 subgraphs are formed with sub-graph partitioning.

Workaround 1: your script will work if all the models you use are fully supported on OpenVINO-EP.

Workaround 2: if some models are only partially supported on OpenVINO-EP, like Encoder.onnx, I would suggest you create the InferenceSession() and run that session right after creating it in the code.

Something like this:

ort_session_enc = onnxruntime.InferenceSession("Encoder.onnx", options)
ort_session_enc.set_providers(['OpenVINOExecutionProvider'], [{'device_type': 'CPU_FP32'}])
ort_outs_enc = ort_session_enc.run(None, {ort_session_enc.get_inputs()[0].name: xs_pad})

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.