lsusb | grep Myriad
Bus 003 Device 012: ID 03e7:2485 Intel Movidius MyriadX
To reproduce
On a device with a MyriadX, install onnxruntime-openvino
Download and unzip python.zip, a sample Custom Vision export using the compact domain (not S1).
Run the command python3 ./onnxruntime_predict.py sample.jpg and note the output and the inference time of around 2442 ms (shown below):
python3 ./onnxruntime_predict.py sample.jpg
2023-01-04 15:52:55.919507021 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-01-04 15:52:55.919580746 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_sessionthreads is true
2023-01-04 15:52:55.919608785 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
2023-01-04 15:52:56.007347026 [I:onnxruntime:, inference_session.cc:1222 Initialize] Initializing session.
2023-01-04 15:52:56.007413914 [I:onnxruntime:, inference_session.cc:1259 Initialize] Adding default CPU execution provider.
2023-01-04 15:52:56.007488562 [I:onnxruntime:, session_state.cc:31 SetupAllocators] Allocator already registered for OrtMemoryInfo:[name:Cpu id:0 OrtMemType:0 OrtAllocatorType:1 Device:[DeviceType:0 MemoryType:0 DeviceId:0]]. Ignoring allocator from CPUExecutionProvider
2023-01-04 15:52:56.012267211 [I:onnxruntime:, reshape_fusion.cc:42 ApplyImpl] Total fused reshape node count: 0
2023-01-04 15:52:56.013809384 [V:onnxruntime:, session_state.cc:1010 VerifyEachNodeIsAssignedToAnEp] Node placements
2023-01-04 15:52:56.013874785 [V:onnxruntime:, session_state.cc:1013 VerifyEachNodeIsAssignedToAnEp] All nodes placed on [OpenVINOExecutionProvider]. Number of nodes: 1
2023-01-04 15:52:56.013941934 [V:onnxruntime:, session_state.cc:66 CreateGraphInfo] SaveMLValueNameIndexMapping
2023-01-04 15:52:56.013984973 [V:onnxruntime:, session_state.cc:112 CreateGraphInfo] Done saving OrtValue mappings.
2023-01-04 15:52:56.014120618 [I:onnxruntime:, session_state_utils.cc:199 SaveInitializedTensors] Saving initialized tensors.
2023-01-04 15:52:56.014224907 [I:onnxruntime:, session_state_utils.cc:286 SaveInitializedTensors] [Memory] SessionStateInitializer statically allocates 22075136 bytes for OpenVINO_CPU
2023-01-04 15:52:56.034266935 [I:onnxruntime:, session_state_utils.cc:342 SaveInitializedTensors] Done saving initialized tensors
2023-01-04 15:52:56.034401557 [I:onnxruntime:, inference_session.cc:1488 Initialize] Session successfully initialized.
2023-01-04 15:52:56.161128992 [I:onnxruntime:, sequential_executor.cc:176 Execute] Begin execution
2023-01-04 15:52:58.496444241 [W:onnxruntime:, execution_frame.cc:828 VerifyOutputSizes] Expected shape from model of {-1,50,13,13} does not match actual shape of {1,50,12,22} for output model_outputs0
2442.52 MS
[{'probability': 0.69025356, 'tagId': 2, 'tagName': 'MailTruck', 'boundingBox': {'left': 0.24598908, 'top': 0.50931931, 'width': 0.07350751, 'height': 0.11108013}}, {'probability': 0.11874406, 'tagId': 3, 'tagName': 'Other', 'boundingBox': {'left': 0.54359399, 'top': 0.60347093, 'width': 0.13043485, 'height': 0.17345715}}]
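For context, the "2442.52 MS" line above is the script's own timing printout. A minimal sketch of how such a figure is typically produced (the exact measurement code inside onnxruntime_predict.py may differ; time_call here is a hypothetical helper, not part of the sample):

```python
import time

def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Stand-in workload in place of session.run(...) so the sketch is self-contained:
result, ms = time_call(lambda: sum(range(100_000)))
print(f"{ms:.2f} MS")
```

Note that a single timed call includes any first-run compilation or device warm-up, which matters when comparing execution providers.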
Change line 30 of onnxruntime_predict.py to remove the provider_options, rerun the command python3 ./onnxruntime_predict.py sample.jpg, and note the roughly 703 ms inference time shown below:
python3 ./onnxruntime_predict.py sample.jpg
2023-01-04 15:55:27.727558718 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-01-04 15:55:27.727633811 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_sessionthreads is true
2023-01-04 15:55:27.727662164 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
2023-01-04 15:55:27.815505720 [I:onnxruntime:, inference_session.cc:1222 Initialize] Initializing session.
2023-01-04 15:55:27.815576300 [I:onnxruntime:, inference_session.cc:1259 Initialize] Adding default CPU execution provider.
2023-01-04 15:55:27.815620865 [I:onnxruntime:, session_state.cc:31 SetupAllocators] Allocator already registered for OrtMemoryInfo:[name:Cpu id:0 OrtMemType:0 OrtAllocatorType:1 Device:[DeviceType:0 MemoryType:0 DeviceId:0]]. Ignoring allocator from CPUExecutionProvider
2023-01-04 15:55:27.820222957 [I:onnxruntime:, reshape_fusion.cc:42 ApplyImpl] Total fused reshape node count: 0
2023-01-04 15:55:27.821672392 [V:onnxruntime:, session_state.cc:1010 VerifyEachNodeIsAssignedToAnEp] Node placements
2023-01-04 15:55:27.821724433 [V:onnxruntime:, session_state.cc:1013 VerifyEachNodeIsAssignedToAnEp] All nodes placed on [OpenVINOExecutionProvider]. Number of nodes: 1
2023-01-04 15:55:27.821758881 [V:onnxruntime:, session_state.cc:66 CreateGraphInfo] SaveMLValueNameIndexMapping
2023-01-04 15:55:27.821789065 [V:onnxruntime:, session_state.cc:112 CreateGraphInfo] Done saving OrtValue mappings.
2023-01-04 15:55:27.821883530 [I:onnxruntime:, session_state_utils.cc:199 SaveInitializedTensors] Saving initialized tensors.
2023-01-04 15:55:27.821966558 [I:onnxruntime:, session_state_utils.cc:286 SaveInitializedTensors] [Memory] SessionStateInitializer statically allocates 22075136 bytes for OpenVINO_CPU
2023-01-04 15:55:27.841562759 [I:onnxruntime:, session_state_utils.cc:342 SaveInitializedTensors] Done saving initialized tensors
2023-01-04 15:55:27.841696043 [I:onnxruntime:, inference_session.cc:1488 Initialize] Session successfully initialized.
2023-01-04 15:55:27.968215795 [I:onnxruntime:, sequential_executor.cc:176 Execute] Begin execution
2023-01-04 15:55:28.563970520 [W:onnxruntime:, execution_frame.cc:828 VerifyOutputSizes] Expected shape from model of {-1,50,13,13} does not match actual shape of {1,50,12,22} for output model_outputs0
703.03 MS
[{'probability': 0.69322181, 'tagId': 2, 'tagName': 'MailTruck', 'boundingBox': {'left': 0.2459873, 'top': 0.50959683, 'width': 0.07348059, 'height': 0.11094462}}, {'probability': 0.1193537, 'tagId': 3, 'tagName': 'Other', 'boundingBox': {'left': 0.54340264, 'top': 0.60339437, 'width': 0.13081755, 'height': 0.17390769}}]
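The slowdown factor follows directly from the two timings the script reported:

```python
# Times reported by the two runs above:
myriad_ms = 2442.52   # with provider_options=[{'device_type': 'MYRIAD_FP16'}]
default_ms = 703.03   # with provider_options removed

# The MyriadX run is roughly 3.5x slower than the default OpenVINO device.
slowdown = myriad_ms / default_ms
print(f"slowdown: {slowdown:.2f}x")
```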
Describe the issue
When trying to accelerate an exported Custom Vision model on a MyriadX VPU, inference is roughly 3.5× slower (2442 ms vs. 703 ms) than with the default OpenVINO device.
In the default sample code from Custom Vision, I updated the InferenceSession line to:
self.session = onnxruntime.InferenceSession(temp, providers=['OpenVINOExecutionProvider'], provider_options=[{'device_type': 'MYRIAD_FP16'}])
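For clarity, the two configurations being compared differ only in the provider_options argument. The sketch below lays them out side by side; the session constructions are commented out so it reads without a model file (temp is the model path from the sample code):

```python
# Both runs use the OpenVINO execution provider.
providers = ['OpenVINOExecutionProvider']

# Configuration A: explicitly target the MyriadX VPU (~2442 ms in this report).
myriad_options = [{'device_type': 'MYRIAD_FP16'}]
# self.session = onnxruntime.InferenceSession(
#     temp, providers=providers, provider_options=myriad_options)

# Configuration B: provider_options removed, so OpenVINO picks its
# default device (~703 ms in this report).
# self.session = onnxruntime.InferenceSession(temp, providers=providers)
```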
Urgency
Low urgency; this is a development machine.
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.13.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
OpenVINO
Execution Provider Library Version
1.13.1