microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Performance] Can oneDNN EP accelerate the inference time of onnxruntime on x86 machines? #14749

Open sanbuphy opened 1 year ago

sanbuphy commented 1 year ago

Describe the issue

I would like to ask about the difference between the default CPU EP and the oneDNN EP: can the oneDNN EP accelerate inference time at the operator level?

I tried the OpenVINO EP, but it does not work well with dynamic-shape inputs such as NLP tasks; it is not as fast as the default CPU EP.
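For context, you can check which execution providers your installed package was built with. A minimal sketch (the prebuilt PyPI CPU wheel typically reports only CPUExecutionProvider; DnnlExecutionProvider appears only in a build made with oneDNN enabled):

```python
import onnxruntime as ort

# Lists the execution providers compiled into this onnxruntime package.
print(ort.get_available_providers())
# e.g. ['CPUExecutionProvider'] for the stock CPU wheel
```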

To reproduce

I want to speed up ONNX Runtime inference on an x86 machine.
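A minimal timing-harness sketch for comparing EPs (the model path, input shape, and run count below are placeholders for illustration, not from the original report):

```python
import time
import numpy as np
import onnxruntime as ort

def bench(providers, model_path="model.onnx", runs=100):
    # Placeholder model and input shape; substitute your own.
    sess = ort.InferenceSession(model_path, providers=providers)
    name = sess.get_inputs()[0].name
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed fixed shape
    sess.run(None, {name: x})  # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {name: x})
    return (time.perf_counter() - start) / runs  # mean latency in seconds

print("CPU EP:", bench(["CPUExecutionProvider"]))
```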

Urgency

No response

Platform

Linux

OS Version

20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

latest

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

Thank you very much.

nickchomey commented 1 year ago

I think oneDNN is included within OpenVINO (https://github.com/openvinotoolkit/openvino/issues/466). Using it directly from onnxruntime seems to require doing your own build - no idea if it would make a difference though... I hope this helps

sanbuphy commented 1 year ago

> I think oneDNN is included within OpenVINO (openvinotoolkit/openvino#466). Using it directly from onnxruntime seems to require doing your own build - no idea if it would make a difference though... I hope this helps

Hello, thank you very much. I built it myself and found that inference is actually slower with the oneDNN EP, while the native CPU EP is faster than before. I think this is because onnxruntime 1.15.0, which I compiled, has a better graph optimization strategy or operator-level optimizations than 1.14.
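For reference, a minimal sketch of selecting the oneDNN EP from Python after a source build with oneDNN enabled (e.g. `./build.sh --use_dnnl`); the model path is a placeholder:

```python
import onnxruntime as ort

# Request the oneDNN EP first; nodes it does not support fall back to the CPU EP.
so = ort.SessionOptions()
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL  # the default, shown explicitly

sess = ort.InferenceSession(
    "model.onnx",  # placeholder path
    sess_options=so,
    providers=["DnnlExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # confirms which EPs were actually registered
```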

sfatimar commented 1 year ago

There is an option, enable_dynamic_shapes, that you can enable for faster performance.

sanbuphy commented 1 year ago

> There is an option, enable_dynamic_shapes, that you can enable for faster performance.

Hi, can you share more information about this? Thank you very much.

sfatimar commented 1 year ago

See https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html. The option can be set through provider options in both the C++ and Python APIs: https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#summary-of-options

| Key | Key type | Allowable values | Value type | Description |
| --- | --- | --- | --- | --- |
| enable_dynamic_shapes | string | True/False | boolean | If enabled, this option works for dynamic-shaped models whose shape is set dynamically at run time on CPU, based on the input image/data shape. It gives the best results when running multiple inferences with varied-shape images/data. |
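For completeness, a sketch of passing this option through provider options in Python (the model path and the device_type value are assumptions for illustration; the option value is passed as a string per the table above):

```python
import onnxruntime as ort

providers = [
    (
        "OpenVINOExecutionProvider",
        {
            "device_type": "CPU_FP32",        # assumed target device
            "enable_dynamic_shapes": "True",  # per the option table above
        },
    ),
    "CPUExecutionProvider",  # fallback for unsupported nodes
]

sess = ort.InferenceSession("model.onnx", providers=providers)  # placeholder model
```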