
ONNXRUNTIME + OpenVINO on ARM64 #11582

thestarivore opened this issue 2 years ago

thestarivore commented 2 years ago

**Is your feature request related to a problem? Please describe.**
Hi, I'm trying to get ONNXRUNTIME + OpenVINO (+ONNX) to work on a Raspberry Pi 4B with an ARM64 OS (Raspbian Bullseye), but I can't find any good guide. I'm building onnxruntime and openvino from scratch and they work individually, but I can't get onnxruntime to recognize the OpenVINO EP. Every time I try to follow the official instructions, my system differs from what the instructions describe, and OpenVINO's setupvars.sh script never works (it never finds Python).

**System information**

**Describe the solution you'd like**
It would be great to have an already compiled package, or a tested installation process that is known to work on ARM64. A Raspberry Pi is the simplest embedded device one could consider.

**Describe alternatives you've considered**
I managed to make it work on Raspbian Buster (ARM32) with the instructions in this article, but unfortunately I can't use an ARM32 OS because I have other packages to install that only work on ARM64.

I've followed the same installation steps as daves003 in issues/6057, but again I can't link OpenVINO to onnxruntime.

Has anyone managed to make ONNXRUNTIME + OpenVINO work on an ARM64 device?
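As a side note, a quick way to check whether the OpenVINO EP was actually compiled into an onnxruntime build is to list the registered providers. This is only a minimal sketch using the standard Python API, assuming the wheel built from source is the one installed:

```python
import onnxruntime as ort

# If the build included the OpenVINO EP, the list should contain
# "OpenVINOExecutionProvider"; otherwise only "CPUExecutionProvider" shows up.
print(ort.__version__)
print(ort.get_available_providers())
```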

Thank you in advance.

jywu-msft commented 2 years ago

This used to work a long time ago but seems to have broken somewhere along the way. The most recent attempt to get it working is documented in https://github.com/microsoft/onnxruntime/issues/8285

thestarivore commented 2 years ago

@jywu-msft thank you, I've managed to make it work. Is there any way to run inference via onnxruntime + OpenVINO using the optimized .bin/.xml models instead of the .onnx model? Right now I don't see a huge difference between running the model on the CPU and via the OpenVINO EP (on an NCS2), which kind of defeats the purpose of using the OpenVINO EP.

jywu-msft commented 2 years ago


That's great. Were there any special/additional steps you had to do, or did the instructions in that linked issue work directly?

Re: the .bin/.xml OpenVINO format: unfortunately, no. OnnxRuntime only accepts .onnx files as input. The benefit of using OnnxRuntime is that OpenVINO does not support all ONNX operators, so OnnxRuntime + OpenVINO can support more models than native OpenVINO by falling back to our CPU implementations for the unsupported subgraphs. Also, using OnnxRuntime gives you a consistent API while deploying on multiple hardware targets, simply by enabling a supported Execution Provider.
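To illustrate that last point, here is a minimal sketch of how the OpenVINO EP is typically enabled through the Python API, with automatic CPU fallback for unsupported subgraphs. The model path is a placeholder, and the `device_type` value (`MYRIAD_FP16` for an NCS2) is an assumption that may differ between OpenVINO EP versions:

```python
import onnxruntime as ort

# Placeholder path; onnxruntime only accepts .onnx input, and the OpenVINO EP
# handles the supported subgraphs internally.
model_path = "model.onnx"

session = ort.InferenceSession(
    model_path,
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
    # "MYRIAD_FP16" is assumed here to target an NCS2; subgraphs the OpenVINO EP
    # cannot handle fall back to the CPU execution provider automatically.
    provider_options=[{"device_type": "MYRIAD_FP16"}, {}],
)

# The calling code stays the same regardless of which execution provider
# actually runs the graph.
print(session.get_providers())
```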