microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Hello, how to improve [Performance] in batch inference with multicore CPU #13820

Open CasonTsai opened 1 year ago

CasonTsai commented 1 year ago

Describe the issue

How can I improve performance for batch inference on a multicore CPU (logical cores, 8 threads per core)?

To reproduce

I just use the code below. Is there other example code to improve performance in batch inference?

```python
import onnxruntime


def create_model_for_provider(model_path: str, provider: str) -> onnxruntime.InferenceSession:
    assert provider in onnxruntime.get_all_providers(), (
        f"provider {provider} not found, {onnxruntime.get_all_providers()}"
    )

    # A few properties that might have an impact on performance (provided by MS)
    options = onnxruntime.SessionOptions()
    options.intra_op_num_threads = 8
    options.inter_op_num_threads = 8
    options.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL
    options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL

    # Load the model as a graph and prepare the CPU backend
    session = onnxruntime.InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()

    return session
```
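For context, here is a minimal sketch of how such a session might be driven with a batched input, assuming the model has a dynamic batch dimension and a single input; the model path, input shape, and provider below are illustrative, not from the original report:

```python
import numpy as np
import onnxruntime

# Build the session with the helper above, using the default CPU provider.
session = create_model_for_provider("model.onnx", "CPUExecutionProvider")

# Assumption: the model exposes a single input with a dynamic batch axis.
# Feeding all samples in one batched call lets the intra-op thread pool
# parallelize the heavy ops, which is usually faster than looping over
# single samples.
input_name = session.get_inputs()[0].name
batch = np.random.rand(32, 3, 224, 224).astype(np.float32)  # hypothetical shape

outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```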

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.13.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

no

pranavsharma commented 1 year ago

I assume you've read this? https://onnxruntime.ai/docs/performance/tune-performance.html
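That page discusses, among other things, thread-count and execution-mode tuning. As a hedged illustration of the kind of variations it suggests (the thread values and model path here are assumptions to benchmark on your own hardware, not recommendations):

```python
import multiprocessing
import onnxruntime

options = onnxruntime.SessionOptions()

# Common guidance: size the intra-op pool to physical cores rather than
# logical ones; cpu_count() reports logical cores on SMT machines, so
# halving it is a rough, illustrative approximation.
options.intra_op_num_threads = max(1, multiprocessing.cpu_count() // 2)

# ORT_PARALLEL can help graphs with independent branches; purely
# sequential graphs usually see no gain, so benchmark both modes.
options.execution_mode = onnxruntime.ExecutionMode.ORT_PARALLEL
options.inter_op_num_threads = 2

session = onnxruntime.InferenceSession(
    "model.onnx",  # hypothetical path
    sess_options=options,
    providers=["CPUExecutionProvider"],
)
```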