microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Hello, how to improve [Performance] in batch inference with multicore CPU #13820

Open CasonTsai opened 1 year ago

CasonTsai commented 1 year ago

Describe the issue

How can I improve performance for batch inference on a multicore CPU (logical cores, 8 threads per core)?

To reproduce

I just use the code below. Is there other example code to improve performance in batch inference?

```python
import onnxruntime


def create_model_for_provider(model_path: str, provider: str) -> onnxruntime.InferenceSession:
    assert provider in onnxruntime.get_all_providers(), (
        f"provider {provider} not found, {onnxruntime.get_all_providers()}"
    )

    # A few properties that might have an impact on performance (provided by MS)
    options = onnxruntime.SessionOptions()
    options.intra_op_num_threads = 8
    options.inter_op_num_threads = 8
    options.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL
    options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL

    # Load the model as a graph and prepare the CPU backend
    session = onnxruntime.InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()

    return session
```
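For context, here is a minimal sketch of how such a session might be driven with a batched input, assuming the model has a dynamic batch dimension and a single input; the model path, input shape, and provider below are illustrative, not from the original report:

```python
import numpy as np
import onnxruntime

# Build the session with the helper above, using the default CPU provider.
session = create_model_for_provider("model.onnx", "CPUExecutionProvider")

# Assumption: the model exposes a single input with a dynamic batch axis.
# Feeding all samples in one batched call lets the intra-op thread pool
# parallelize the heavy ops, which is usually faster than looping over
# single samples.
input_name = session.get_inputs()[0].name
batch = np.random.rand(32, 3, 224, 224).astype(np.float32)  # hypothetical shape

outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```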

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.13.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

no

pranavsharma commented 1 year ago

I assume you've read this? https://onnxruntime.ai/docs/performance/tune-performance.html
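That page discusses, among other things, thread-count and execution-mode tuning. As a hedged illustration of the kind of variations it suggests (the thread values and model path here are assumptions to benchmark on your own hardware, not recommendations):

```python
import multiprocessing
import onnxruntime

options = onnxruntime.SessionOptions()

# Common guidance: size the intra-op pool to physical cores rather than
# logical ones; cpu_count() reports logical cores on SMT machines, so
# halving it is a rough, illustrative approximation.
options.intra_op_num_threads = max(1, multiprocessing.cpu_count() // 2)

# ORT_PARALLEL can help graphs with independent branches; purely
# sequential graphs usually see no gain, so benchmark both modes.
options.execution_mode = onnxruntime.ExecutionMode.ORT_PARALLEL
options.inter_op_num_threads = 2

session = onnxruntime.InferenceSession(
    "model.onnx",  # hypothetical path
    sess_options=options,
    providers=["CPUExecutionProvider"],
)
```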