microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

OpenVINO EP inferencing issue for some combinations of model/platforms/device #4262

Open danielecazzari opened 4 years ago

danielecazzari commented 4 years ago

I’m currently testing the following models with the OpenVINO Execution Provider:

- Tiny YoloV2 from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny-yolov2
- Tiny YoloV3 from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny-yolov3
- Faster RCNN from the ONNX Model Zoo: https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/faster-rcnn
- Human Pose Estimation from the OpenVINO Model Zoo (Public Models): https://github.com/opencv/open_model_zoo/blob/master/models/public/human-pose-estimation-3d-0001/description/human-pose-estimation-3d-0001.md

on these platforms:

- Windows 10 laptop with an i7-8650U
- IEI Tank, Ubuntu 16, with an i7-6700 and a Mustang card (VAD-M_FP16)
- Azure NC6, Ubuntu 18, with a Xeon E5-2690

I found the issues below in some configurations, and I would like to check with you whether they are expected.
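For context, a minimal sketch of how such a model might be loaded through onnxruntime with the OpenVINO EP and a CPU fallback. This is not the reporter's actual script; the model path, input names, and `device_type` value are illustrative assumptions.

```python
# Hypothetical sketch (not the reporter's actual code): building the provider
# list used to run a model on the OpenVINO Execution Provider, with the CPU EP
# listed after it so unsupported subgraphs fall back to the default CPU path.

def openvino_providers(device_type):
    """Provider list with CPU fallback; device_type (e.g. "VAD-M_FP16",
    "CPU_FP32", "GPU_FP16") is an assumption about the target hardware."""
    return [
        ("OpenVINOExecutionProvider", {"device_type": device_type}),
        "CPUExecutionProvider",
    ]

providers = openvino_providers("VAD-M_FP16")

# With onnxruntime installed, the session would then be created roughly as:
# import onnxruntime as ort
# sess = ort.InferenceSession("tiny-yolov3-11.onnx", providers=providers)
# outputs = sess.run(None, feed)   # feed: dict of input name -> numpy array
```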

Additionally, I noticed some issues during the Python shutdown phase (DLL detach): on Windows it crashes almost systematically for GPU inferencing (sometimes loading the model is enough; other times you also need to run an inference), and on Linux the VAD-M_FP16 device hangs on exit (Ctrl+C is needed).

Model inferencing issues:

File "/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/session.py", line 177, in _load_model
    self._sess.load_model(providers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /data/onnx/onnxruntime-1.3.0/onnxruntime/core/providers/openvino/backend_utils.cc:69 std::shared_ptr<InferenceEngine::CNNNetwork> onnxruntime::openvino_ep::backend_utils::CreateCNNNetwork(const onnx::ModelProto&, std::__cxx11::string, InferenceEngine::Precision) [OpenVINO-EP]  Exception thrown while making IE::CNNNetwork: All operations in nGraph function should have unique friendly names!
File "/usr/local/lib/python3.5/dist-packages/onnxruntime/capi/session.py", line 111, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_4 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_4_3' Status Message: /data/onnx/onnxruntime-1.3.0/onnxruntime/core/providers/openvino/backends/basic_backend.cc:41 onnxruntime::openvino_ep::BasicBackend::BasicBackend(const onnx::ModelProto&, onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::openvino_ep::SubGraphContext&) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_4_3Error has occured for: eltwise:TFNodes/yolo_evaluation_layer_1/truediv_8:0
Sizes equal or broadcast is possible(true) should be false
Invalid input shapes
jywu-msft commented 4 years ago

Thanks! Some of these models have known issues in OpenVINO, such as faster-rcnn and tiny-yolov3. FYI @smkarlap @suryasidd, can you cross-check this error report with your internal testing?

suryasidd commented 4 years ago

Hey George, I have cross-checked these errors with our internal testing, and most of them match what we see with faster-rcnn, tiny-yolov3, and yolov3. We have not run the human pose estimation model through onnxruntime; I will run it and check.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

stale[bot] commented 3 years ago

This issue has been automatically closed due to inactivity. Please reactivate if further support is needed.

manashgoswami commented 3 years ago

@suryasidd can you please update status of these models executing with EP?

suryasidd commented 3 years ago

Hey @manashgoswami, with the 2020.4 release, this is the status of the models mentioned in this ticket:

| Model Name  | CPU_FP32 | GPU_FP32    | GPU_FP16    | MYRIAD  |
|-------------|----------|-------------|-------------|---------|
| TinyYoloV3  | Working  | Not Working | Not Working | Working |
| Faster RCNN | Working  | Working     | Not Working | Working |

Hey @danielecazzari, can you please let me know where I can download the human pose estimation ONNX model?

suryasidd commented 3 years ago

Hey @danielecazzari, with the 2020.4 release I have verified that the human pose estimation model works in all configurations on Linux. Please try it with the latest release and let me know if you still have the problem. If not, we can close this issue.

danielecazzari commented 3 years ago

Hi, I confirm that human pose estimation now works on both Windows and Linux CPU (Azure); I still need to test on the IEI Tank. I also confirm the TinyYoloV3 and Faster RCNN issues. In addition, I tested an SSD model, which does not work on any platform/device, with this error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\onnx\onnxruntime-1.5.1\onnxruntime\core\providers\openvino\backends\basic_backend.cc:67 __cdecl onnxruntime::openvino_ep::BasicBackend::BasicBackend(const class onnx::ModelProto &,struct onnxruntime::openvino_ep::GlobalContext &,const struct onnxruntime::openvino_ep::SubGraphContext &) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0: Check 'false' failed at C:\j\workspace\private-ci\ie\build-windows-icc2018\b\repos\openvino\ngraph\src\ngraph\node.cpp:114:
While validating node 'v1::VariadicSplit VariadicSplit_3609(Transpose_2600[0]:f32{1,15130,4}, Constant_3607[0]:i64{}, Constant_3608[0]:i64{2}) -> (f32{1,15130,2}, f32{1,15130,2})':
Default output not supported

Regards,

Daniele

addisonklinke commented 3 years ago

@daniele-pizziconi Could you please share the OpenVINO Model Optimizer command you used to produce the IR files for the Tiny YoloV3 model? Specifically, any input shape and name arguments. I downloaded the associated ONNX model from your link and ran the following conversion with OpenVINO 2021.2:

python3 mo.py --input_model tiny-yolov3-11.onnx --progress

However, I get an error:

Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:  tiny-yolov3-11.onnx
    - Path for generated IR:    /opt/intel/openvino_2021.2.185/deployment_tools/model_optimizer/.
    - IR output name:   tiny-yolov3-11
    - Log level:    ERROR
    - Batch:    Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:    Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:  Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:  FP32
    - Enable fusing:    True
    - Enable grouped convolutions fusing:   True
    - Move mean values to preprocess section:   None
    - Reverse input channels:   False
ONNX specific parameters:
Model Optimizer version:    2021.2.0-1877-176bdf51370-releases/2021/2
Progress: [.......             ]  36.49% done[ ERROR ]  Cannot infer shapes or values for node "TFNodes/yolo_evaluation_layer_1/Squeeze".
[ ERROR ]  Trying to squeeze dimension not equal to 1 for node "TFNodes/yolo_evaluation_layer_1/Squeeze"
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function Squeeze.infer at 0x7fc832ef39d8>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "TFNodes/yolo_evaluation_layer_1/Squeeze" node. 
 For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)
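When Model Optimizer stops shape/value propagation like this, one common first step is to pass explicit input names and static shapes so MO does not have to infer them. The sketch below is an unverified assumption for this model: the input names (`input_1`, `image_shape`) and shapes are illustrative, taken from typical tiny-yolov3 exports, and may need adjusting.

```shell
# Hypothetical workaround sketch: pin static input shapes so Model Optimizer
# does not rely on shape inference. Input names and shapes are assumptions.
python3 mo.py \
    --input_model tiny-yolov3-11.onnx \
    --input input_1,image_shape \
    --input_shape "[1,3,416,416],[1,2]" \
    --progress
```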