Closed wunianqing closed 1 year ago
I'm not sure about the thread safety of an ORT session without looking through the docs or asking others, but I doubt it is safe. You may, however, be able to mitigate much of that per-session cost by creating a distinct session per thread while sharing the larger tensors between them all. I forget the API function (I'm on my phone right now), but there is one in the onnxruntime C API to do so.
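To make the suggestion concrete, here is a minimal sketch of the one-session-per-thread pattern. `FakeSession` is a placeholder invented for illustration, not the real onnxruntime API; the point is that each thread owns its own session while the large input buffer is shared read-only:

```python
import threading

class FakeSession:
    """Placeholder for an ORT session (invented for illustration)."""
    def __init__(self, model_path):
        self.model_path = model_path  # each thread loads the same model

    def run(self, inputs):
        # stand-in for the real Run() call; here it just sums the input
        return sum(inputs)

def worker(model_path, shared_input, results, idx):
    session = FakeSession(model_path)   # distinct session per thread
    results[idx] = session.run(shared_input)

shared_input = [1, 2, 3, 4]             # the large tensor, shared read-only
results = [None] * 4
threads = [
    threading.Thread(target=worker, args=("model.onnx", shared_input, results, i))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [10, 10, 10, 10]
```

Because the shared buffer is only read, no lock is needed around it; each thread writes only to its own slot in `results`.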
Thank you very much for your response! I will look for the C API you mentioned and try it. It would be appreciated if you could post the API name here later.
I think the API is EnableMemPattern(). But according to this, the DML EP does not support it.
This is what I saw before: https://onnxruntime.ai/docs/api/c/struct_ort_api.html#a0dcdc66ac26c5d9aae1ccadf09f059fc
The sample project is very useful. With just small modifications, we can make the upload, run, and download steps execute in parallel.
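A rough sketch of that pipelining idea, assuming three stages (upload, run, download) connected by queues so each stage can work on a different item at the same time. The stage functions here are trivial stand-ins, not onnxruntime calls:

```python
import queue
import threading

def stage(fn, inbox, outbox):
    # Pull items until the None sentinel, apply fn, pass results downstream.
    while True:
        item = inbox.get()
        if item is None:
            if outbox is not None:
                outbox.put(None)    # forward the shutdown signal
            break
        result = fn(item)
        if outbox is not None:
            outbox.put(result)

# Trivial stand-ins for the real stages.
upload = lambda x: x                # host -> device copy
run = lambda x: x * 2               # inference
results = []
def download(x):                    # device -> host copy
    results.append(x)

q_in, q_run, q_out = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(upload, q_in, q_run)),
    threading.Thread(target=stage, args=(run, q_run, q_out)),
    threading.Thread(target=stage, args=(download, q_out, None)),
]
for t in threads:
    t.start()
for x in range(4):
    q_in.put(x)
q_in.put(None)                      # shut the pipeline down
for t in threads:
    t.join()
print(results)  # [0, 2, 4, 6]
```

With one thread per stage and FIFO queues between them, item order is preserved while the stages overlap in time.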
My question is: can we run inference on multiple inputs with one session across multiple threads? This would be very useful when the model is small. I already know that we can create one session per thread, but each session consumes additional memory.
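For comparison, the pattern being asked about — several threads calling run on the same session object — would look like the sketch below. `SharedSession` is again an illustrative stand-in; whether this is actually safe depends on the runtime and execution provider (the onnxruntime docs do describe concurrent Run() calls on one session as supported, but that should be verified for the EP in use):

```python
import threading

class SharedSession:
    """Stand-in for a single ORT session shared by all threads (illustrative)."""
    def run(self, x):
        return x * x   # pretend inference

session = SharedSession()       # one session, created once
outputs = [None] * 4

def worker(i):
    # every thread calls run() on the same session object
    outputs[i] = session.run(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(outputs)  # [0, 1, 4, 9]
```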