microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

OpenVINO EP inferencing issue for some combinations of model/platforms/device #4262

Open danielecazzari opened 4 years ago

danielecazzari commented 4 years ago

I’m currently testing the following models with the OpenVINO Execution Provider:

- Tiny YoloV2 from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny-yolov2
- Tiny YoloV3 from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny-yolov3
- Faster RCNN from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/faster-rcnn
- Human Pose Estimation from the OpenVINO Model Zoo (Public Models): https://github.com/opencv/open_model_zoo/blob/master/models/public/human-pose-estimation-3d-0001/description/human-pose-estimation-3d-0001.md

on these platforms:

- Windows 10 laptop with an i7-8650U
- IEI Tank, Ubuntu 16, with an i7-6700 and a Mustang card (VAD-M_FP16)
- Azure NC6, Ubuntu 18, with a Xeon E5-2690

I found the issues below in some configurations, and I would like to check with you whether they are expected.
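For context, a minimal sketch of how such a model might be loaded through onnxruntime with the OpenVINO EP and a CPU fallback. This is not the reporter's actual script; the model path, input names, and `device_type` value are illustrative assumptions.

```python
# Hypothetical sketch (not the reporter's actual code): building the provider
# list used to run a model on the OpenVINO Execution Provider, with the CPU EP
# listed after it so unsupported subgraphs fall back to the default CPU path.

def openvino_providers(device_type):
    """Provider list with CPU fallback; device_type (e.g. "VAD-M_FP16",
    "CPU_FP32", "GPU_FP16") is an assumption about the target hardware."""
    return [
        ("OpenVINOExecutionProvider", {"device_type": device_type}),
        "CPUExecutionProvider",
    ]

providers = openvino_providers("VAD-M_FP16")

# With onnxruntime installed, the session would then be created roughly as:
# import onnxruntime as ort
# sess = ort.InferenceSession("tiny-yolov3-11.onnx", providers=providers)
# outputs = sess.run(None, feed)   # feed: dict of input name -> numpy array
```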

Additionally, I noticed some issues during the Python shutdown phase (DLL detach): on Windows it crashes almost systematically for GPU inferencing (sometimes loading the model is enough; other times you also need to run an inference), and on Linux the VAD-M_FP16 device hangs on exit (Ctrl+C is needed).

Model inferencing issues:

File "/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/session.py", line 177, in _load_model
    self._sess.load_model(providers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /data/onnx/onnxruntime-1.3.0/onnxruntime/core/providers/openvino/backend_utils.cc:69 std::shared_ptr<InferenceEngine::CNNNetwork> onnxruntime::openvino_ep::backend_utils::CreateCNNNetwork(const onnx::ModelProto&, std::__cxx11::string, InferenceEngine::Precision) [OpenVINO-EP]  Exception thrown while making IE::CNNNetwork: All operations in nGraph function should have unique friendly names!
File "/usr/local/lib/python3.5/dist-packages/onnxruntime/capi/session.py", line 111, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_4 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_4_3' Status Message: /data/onnx/onnxruntime-1.3.0/onnxruntime/core/providers/openvino/backends/basic_backend.cc:41 onnxruntime::openvino_ep::BasicBackend::BasicBackend(const onnx::ModelProto&, onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::openvino_ep::SubGraphContext&) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_4_3Error has occured for: eltwise:TFNodes/yolo_evaluation_layer_1/truediv_8:0
Sizes equal or broadcast is possible(true) should be false
Invalid input shapes
jywu-msft commented 4 years ago

Thanks! Some of these models have known issues in OpenVINO, such as faster-rcnn and tiny-yolov3. FYI @smkarlap @suryasidd, can you cross-check this error report with your internal testing?

suryasidd commented 4 years ago

Hey George, I have cross-checked these errors with our internal testing, and most of them match what we see with faster-rcnn, tiny-yolov3, and yolov3. We have not run the human pose estimation model through onnxruntime; I will run it and check.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

stale[bot] commented 3 years ago

This issue has been automatically closed due to inactivity. Please reactivate if further support is needed.

manashgoswami commented 3 years ago

@suryasidd can you please update status of these models executing with EP?

suryasidd commented 3 years ago

Hey @manashgoswami, with the 2020.4 release, this is the status of the models mentioned in this ticket:

| Model Name  | CPU_FP32 | GPU_FP32    | GPU_FP16    | MYRIAD  |
|-------------|----------|-------------|-------------|---------|
| TinyYoloV3  | Working  | Not Working | Not Working | Working |
| Faster RCNN | Working  | Working     | Not Working | Working |

Hey @danielecazzari, can you please let me know where I can download the human pose estimation ONNX model?

suryasidd commented 3 years ago

Hey @danielecazzari, with the 2020.4 release I have verified that the human pose estimation model works in all configurations on Linux. Please try it with the latest release and let me know if you still have the problem. If not, we can close this issue.

danielecazzari commented 3 years ago

Hi, I confirm that human pose estimation now works on both Windows and Linux CPU (Azure); I still need to test on the IEI Tank. I also confirm the TinyYoloV3 and Faster RCNN issues. In addition, I tested an SSD model, which does not work on any platform/device, with this error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\onnx\onnxruntime-1.5.1\onnxruntime\core\providers\openvino\backends\basic_backend.cc:67 __cdecl onnxruntime::openvino_ep::BasicBackend::BasicBackend(const class onnx::ModelProto &,struct onnxruntime::openvino_ep::GlobalContext &,const struct onnxruntime::openvino_ep::SubGraphContext &) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0: Check 'false' failed at C:\j\workspace\private-ci\ie\build-windows-icc2018\b\repos\openvino\ngraph\src\ngraph\node.cpp:114:
While validating node 'v1::VariadicSplit VariadicSplit_3609(Transpose_2600[0]:f32{1,15130,4}, Constant_3607[0]:i64{}, Constant_3608[0]:i64{2}) -> (f32{1,15130,2}, f32{1,15130,2})':
Default output not supported

Regards,

Daniele

addisonklinke commented 3 years ago

@daniele-pizziconi Could you please share the OpenVINO Model Optimizer command you used to produce the IR files for the Tiny YoloV3 model? Specifically, any input shape and name arguments. I downloaded the associated ONNX model from your link and ran the following conversion with OpenVINO 2021.2:

python3 mo.py --input_model tiny-yolov3-11.onnx --progress

However, I get an error:

Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:  tiny-yolov3-11.onnx
    - Path for generated IR:    /opt/intel/openvino_2021.2.185/deployment_tools/model_optimizer/.
    - IR output name:   tiny-yolov3-11
    - Log level:    ERROR
    - Batch:    Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:    Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:  Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:  FP32
    - Enable fusing:    True
    - Enable grouped convolutions fusing:   True
    - Move mean values to preprocess section:   None
    - Reverse input channels:   False
ONNX specific parameters:
Model Optimizer version:    2021.2.0-1877-176bdf51370-releases/2021/2
Progress: [.......             ]  36.49% done[ ERROR ]  Cannot infer shapes or values for node "TFNodes/yolo_evaluation_layer_1/Squeeze".
[ ERROR ]  Trying to squeeze dimension not equal to 1 for node "TFNodes/yolo_evaluation_layer_1/Squeeze"
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function Squeeze.infer at 0x7fc832ef39d8>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "TFNodes/yolo_evaluation_layer_1/Squeeze" node. 
 For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)
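When Model Optimizer stops shape/value propagation like this, one common first step is to pass explicit input names and static shapes so MO does not have to infer them. The sketch below is an unverified assumption for this model: the input names (`input_1`, `image_shape`) and shapes are illustrative, taken from typical tiny-yolov3 exports, and may need adjusting.

```shell
# Hypothetical workaround sketch: pin static input shapes so Model Optimizer
# does not rely on shape inference. Input names and shapes are assumptions.
python3 mo.py \
    --input_model tiny-yolov3-11.onnx \
    --input input_1,image_shape \
    --input_shape "[1,3,416,416],[1,2]" \
    --progress
```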