microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Profiling multithreaded runs #18600

Open Archie3d opened 9 months ago

Archie3d commented 9 months ago

Describe the issue

When I enable profiling:

    Ort::SessionOptions sessionOptions{};
    sessionOptions.SetIntraOpNumThreads(4);
    sessionOptions.SetInterOpNumThreads(4);
    sessionOptions.SetExecutionMode(ORT_PARALLEL);
    sessionOptions.EnableProfiling(L"profile");

The profile file is generated correctly, but inference falls back to a single thread, and the profiling trace shows only one thread. I would like to be able to profile multithreaded execution.

In this trace I expect a single call to session Run() to be parallelized across 4 threads; instead, it stays on the same thread: (image)
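For anyone who wants to confirm the thread count without opening the trace in a viewer, here is a minimal sketch that counts the distinct thread ids in the generated profile text. It is not part of ONNX Runtime: `distinct_thread_ids` is a hypothetical helper, and it assumes the profile is Chrome-trace-style JSON in which each event carries a `"tid" : <integer>` field. It is a plain text scan, not a JSON parser.

```cpp
#include <regex>
#include <set>
#include <string>

// Hypothetical helper (not an ONNX Runtime API).
// Collects the distinct thread ids found in the text of a profile file.
// Assumption: each event stores its thread id as `"tid" : <integer>`,
// with optional whitespace around the colon.
std::set<long long> distinct_thread_ids(const std::string& trace_json) {
    static const std::regex tid_pattern(R"re("tid"\s*:\s*(\d+))re");
    std::set<long long> ids;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(trace_json.begin(), trace_json.end(), tid_pattern);
         it != end; ++it) {
        // Capture group 1 holds the digits of the thread id.
        ids.insert(std::stoll((*it)[1].str()));
    }
    return ids;
}
```

To use it, read the whole profile file into a `std::string` and pass it in. If the returned set has size 1, every recorded event ran on the same thread, which matches the behaviour described above.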

To reproduce

Enable profiling and multiple inter-op threads in the session options.

Urgency

No response

Platform

Windows

OS Version

10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.3

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

yuslepukhin commented 9 months ago

InterOp threads are only used with a parallel executor, which you are not using. IntraOp refers to internal kernel parallelization. The profiling measures only kernel execution start to end.

Archie3d commented 9 months ago

Actually, I am using it. I call

    sessionOptions.SetExecutionMode(ORT_PARALLEL);

but the entire model is still executed on a single thread. My model is composed of four independent chunks, so I expect it to run in parallel on four threads.