vymao opened this issue 1 year ago
Thanks, but I still wonder about the second part of my question: Some Stack Overflow posts (like this) and other related issues (https://github.com/microsoft/onnxruntime/issues/12654) suggest that it is possible to create a global threadpool shared across sessions, but I wonder which approach is more efficient.
Hi @pranavsharma, just following up on this?
It would be nice if you could access the global thread pool inside ONNX Runtime and wrap a task system around it using something like https://github.com/Naios/continuable. Then you could run all your ONNX models concurrently on a single shared thread pool. That would be sweet.
I'd be interested in hearing a more complete answer to this question: is this the recommended way of dealing with multiple models?
It was just a recommendation. I use the onnxruntime shared thread pool for all my models, but I want to go a step further and launch both models asynchronously on that same shared pool as well. Currently, you have to launch them on your own threads.
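Roughly what I mean by launching them on your own threads, as a minimal sketch (model paths and thread counts are placeholders, and I'm assuming the Ort::ThreadingOptions / DisablePerSessionThreads() route for the shared pool):

```cpp
#include <onnxruntime_cxx_api.h>

#include <thread>

int main() {
  // One env that owns the global intra-op/inter-op pools shared by every session.
  Ort::ThreadingOptions tp;
  tp.SetGlobalIntraOpNumThreads(4);
  tp.SetGlobalInterOpNumThreads(1);
  Ort::Env env(tp, ORT_LOGGING_LEVEL_WARNING, "shared_pool");

  Ort::SessionOptions so;
  so.DisablePerSessionThreads();  // use the env's global pools instead of per-session ones

  Ort::Session session_a{env, "model_a.onnx", so};  // placeholder paths
  Ort::Session session_b{env, "model_b.onnx", so};

  auto run_model = [](Ort::Session& s) {
    // ... build inputs and call s.Run(...) here as you normally would ...
    (void)s;
  };

  // Run() blocks the calling thread, so each model gets its own caller thread;
  // the operator-level work of both still executes on the shared global pool.
  std::thread t1(run_model, std::ref(session_a));
  std::thread t2(run_model, std::ref(session_b));
  t1.join();
  t2.join();
  return 0;
}
```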
I've just noticed that Ort::Session now has a RunAsync() function which runs on ORT's intra-op thread pool. That might solve your problem.
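For anyone landing here, a minimal sketch of what a RunAsync() call could look like (the model path, tensor shape, and the "input"/"output" names are placeholders; double-check the callback signature against the onnxruntime_cxx_api.h you build with):

```cpp
#include <onnxruntime_cxx_api.h>

#include <atomic>
#include <cstdio>
#include <vector>

// Completion flag set from the callback, which runs on one of ORT's own threads.
std::atomic<bool> g_done{false};

void OnInferenceDone(void* /*user_data*/, OrtValue** /*outputs*/, size_t /*num_outputs*/,
                     OrtStatusPtr status) {
  if (status != nullptr) {
    std::printf("RunAsync failed: %s\n", Ort::GetApi().GetErrorMessage(status));
    Ort::GetApi().ReleaseStatus(status);
  }
  g_done = true;
}

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "run_async_demo"};
  Ort::SessionOptions so;
  Ort::Session session{env, "model.onnx", so};  // placeholder model path

  // Placeholder input: a 1x3 float tensor named "input"; adjust to your model.
  std::vector<float> input_data(3, 0.f);
  std::vector<int64_t> shape{1, 3};
  Ort::MemoryInfo mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      mem, input_data.data(), input_data.size(), shape.data(), shape.size());
  Ort::Value output{nullptr};  // left null so ORT allocates the output

  const char* input_names[] = {"input"};
  const char* output_names[] = {"output"};

  // Inputs and the output slot must stay alive until the callback has fired.
  session.RunAsync(Ort::RunOptions{nullptr}, input_names, &input, 1,
                   output_names, &output, 1, OnInferenceDone, nullptr);

  while (!g_done) { /* do other work; a condition variable would be nicer */ }
  return 0;
}
```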
It would still be cool if you could explicitly access that thread pool and roll your own task system, so your whole program uses a single pool shared by all models and any other concurrent code you need to run.
Describe the issue
I want to run 2 or 3 ONNX models in C++ simultaneously. The only approach that seems to be recommended is to create one session per model, each in a different thread or process, and then somehow map the thread affinities to particular cores to avoid contention.
I'm not 100% sure how to do this, but based on what I could find, I assume the following:

- Create separate Ort::Env, Ort::SessionOptions, and Ort::Session instances for each model.
- Set the thread affinities through the Ort::Env like so. The only way I have found of doing this is to use SetGlobalIntraOpThreadAffinity, which seems to indicate that I need to create separate Ort::Envs to set affinities globally within each env. I did not find any affinity methods in the Ort::SessionOptions API.

Is this correct? And is there a more efficient way of doing this? Some Stack Overflow posts (like this) and other related issues (like this) suggest that it is possible to create a global threadpool shared across sessions, but I'm not sure how to configure that, or whether it is more efficient.
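My best guess at that configuration, pieced together from the linked posts, is below. It is only a minimal sketch: it assumes Ort::ThreadingOptions, SessionOptions::DisablePerSessionThreads(), and the documented affinity-string format, and the model paths, thread counts, and processor ids are placeholders.

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::ThreadingOptions tp;
  tp.SetGlobalIntraOpNumThreads(4);
  tp.SetGlobalInterOpNumThreads(1);
  // Affinity string format per the docs: one group per extra intra-op thread,
  // groups separated by ';', processor ids (1-based) within a group by ','.
  // With 4 intra-op threads, the 3 pool threads get pinned to processors 2-4
  // while the calling thread remains the 4th worker.
  tp.SetGlobalIntraOpThreadAffinity("2;3;4");

  // The env owns the single global thread pool shared by all sessions created from it.
  Ort::Env env(tp, ORT_LOGGING_LEVEL_WARNING, "global_pool");

  Ort::SessionOptions so;
  so.DisablePerSessionThreads();  // opt each session into the env's global pools

  // Both sessions now schedule their work on the same shared pool.
  Ort::Session session_a{env, "model_a.onnx", so};  // placeholder model paths
  Ort::Session session_b{env, "model_b.onnx", so};
  return 0;
}
```

If that is right, the sessions share one pool and one set of affinities, and there is no per-session thread duplication to manage.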
To reproduce
N/A
Urgency
No response
Platform
Mac
OS Version
13.5
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16.2
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response