microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Performance] How to set the number of threads when using the TRT EP #22913

Open noahzn opened 22 hours ago

noahzn commented 22 hours ago

Describe the issue

I notice multiple threads when using ONNX Runtime (TRT EP). Is this normal behavior? (screenshot attached)

The documentation says:

Set number of intra-op threads
Onnxruntime sessions utilize multi-threading to parallelize computation inside each operator.

By default with intra_op_num_threads=0 or not set, each session will start with the main thread on the 1st core (not affinitized). Then extra threads per additional physical core are created, and affinitized to that core (1 or 2 logical processors).

I'm using the TRT EP, although my providers list also includes CUDAExecutionProvider and CPUExecutionProvider. How can I set the number of threads for the TRT EP? Thanks.
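For reference, this is roughly how the session is created (a simplified sketch, not my actual code; the model path, device id, and thread count are placeholders). As far as I understand, intra_op_num_threads only affects ONNX Runtime's own CPU thread pool, so I'm not sure whether it applies to the TRT EP at all:

```python
import onnxruntime as ort

# intra_op_num_threads configures ONNX Runtime's own CPU (intra-op) thread pool,
# e.g. for nodes that fall back to the CPU EP; TensorRT and CUDA may still
# create their own internal worker threads.
so = ort.SessionOptions()
so.intra_op_num_threads = 2  # 0 / unset = roughly one thread per physical core

providers = [
    ("TensorrtExecutionProvider", {"device_id": 0}),
    ("CUDAExecutionProvider", {"device_id": 0}),
    "CPUExecutionProvider",
]

# "model.onnx" is a placeholder for the real model.
sess = ort.InferenceSession("model.onnx", sess_options=so, providers=providers)
```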

To reproduce

No code can be provided.

Urgency

No response

Platform

Other / Unknown

OS Version

JetPack=5.1.2

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu=1.17.0

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

TensorRT

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No