Describe the issue
How can I improve performance for batch inference on a multicore CPU (logical cores, 8 threads per core)?
To reproduce
I just use the code below; is there other example code to improve performance for batch inference?

```python
import onnxruntime


def create_model_for_provider(model_path: str, provider: str) -> onnxruntime.InferenceSession:
    assert provider in onnxruntime.get_all_providers(), \
        f"provider {provider} not found, {onnxruntime.get_all_providers()}"

    # Few properties that might have an impact on performance (provided by MS)
    options = onnxruntime.SessionOptions()
    options.intra_op_num_threads = 8
    options.inter_op_num_threads = 8
    options.execution_mode = onnxruntime.ExecutionMode.ORT_SEQUENTIAL
    options.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL

    # Load the model as a graph and prepare the CPU backend
    session = onnxruntime.InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()
    return session
```
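For context, this is roughly how I call the session for batch inference (a minimal sketch; the model path, input shape, and batch size are placeholders, and it assumes the model's first input has a dynamic batch dimension):

```python
import numpy as np

# Minimal sketch of a batched call, using create_model_for_provider from above.
# Placeholders: model path, input shape, batch size.
session = create_model_for_provider("model.onnx", "CPUExecutionProvider")

input_name = session.get_inputs()[0].name

# Stack N samples into a single array and run them in one call,
# rather than calling session.run() once per sample.
batch = np.random.rand(32, 128).astype(np.float32)  # (batch_size, feature_dim)
outputs = session.run(None, {input_name: batch})
```

In particular, I am unsure whether `intra_op_num_threads` should match the physical core count rather than the logical core count, and whether `inter_op_num_threads` matters at all under `ORT_SEQUENTIAL` execution mode.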
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.13.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No