RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3

Rikyf3 commented 7 months ago

Describe the issue

When running inference on DirectML EP I get the following error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node. Name:'StatefulPartitionedCall/generator/tf.__operators__.add/AddV2' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2482)\onnxruntime_pybind11_state.pyd!00007FFDEA13C46F: (caller: 00007FFDEA13DC21) Exception(3) tid(1828) 80070057 The parameter is incorrect.

I have seen similar issues reported in #19405 and #18666, but they seem to have been solved by updating to v1.17.1. I am using v1.17.3.

To reproduce

I am not able to reproduce the issue. One of our users reported the error. Our inference call is very basic, with the addition of ort_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL. We are running inference on a U-Net model.

Urgency

Very urgent.

Platform

Windows

OS Version

win10 Pro, 10.0.19045

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.3

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU, DirectML

Execution Provider Library Version

No response

NarutoUA commented 7 months ago

Documentation states DirectML doesn't support parallel execution mode:

The DirectML execution provider does not support the use of memory pattern optimizations or parallel execution in onnxruntime. When supplying session options during InferenceSession creation, these options must be disabled or an error will be returned.

If creating the onnxruntime InferenceSession object directly, you must set the appropriate fields on the onnxruntime::SessionOptions struct. Specifically, execution_mode must be set to ExecutionMode::ORT_SEQUENTIAL, and enable_mem_pattern must be false.

Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects.

https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html#configuration-options
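For the Python API, the documented requirements above could be sketched roughly as follows. This is a minimal sketch, not the reporter's actual code: `make_dml_session` and `SerializedSession` are hypothetical helper names, the model path is whatever the caller supplies, and the provider list with a CPU fallback is an assumption. The lock wrapper is one simple way to guarantee that only one thread calls Run on a given session at a time.

```python
import threading


def make_dml_session(model_path):
    """Create a session with the options the DirectML docs ask for (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded on machines without onnxruntime.
    import onnxruntime as ort

    so = ort.SessionOptions()
    so.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL  # not ORT_PARALLEL
    so.enable_mem_pattern = False  # memory pattern optimization is unsupported on DML
    return ort.InferenceSession(
        model_path,
        sess_options=so,
        # Assumption: try DirectML first and fall back to the CPU provider.
        providers=["DmlExecutionProvider", "CPUExecutionProvider"],
    )


class SerializedSession:
    """Wrap a session so that only one thread at a time can call run() (hypothetical helper)."""

    def __init__(self, session):
        self._session = session
        self._lock = threading.Lock()

    def run(self, output_names, input_feed):
        # The lock serializes concurrent callers that share this one session.
        with self._lock:
            return self._session.run(output_names, input_feed)
```

Threads that share one session would then call `SerializedSession(make_dml_session(path)).run(None, feeds)`; threads that each own a separate session do not need the lock.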

Rikyf3 commented 7 months ago

Interesting... the code was tested on multiple Windows PCs and we have never experienced errors. As stated above, I am not able to reproduce the issue myself. As soon as I get feedback, I will close the issue if the problem has been solved. Thanks!

fdwr commented 7 months ago

Interesting... the code was tested on multiple Windows PCs and we have never experienced errors.

It's possible to get lucky, but that setting is unsupported.

Can you get more info using the graphics and DML debug layer? It's probably already installed if you have Visual Studio installed, but otherwise see: https://learn.microsoft.com/en-us/windows/ai/directml/dml-debug-layer. Then:

Start / Run / dxcpl.exe
Add your process .exe path to the list.
Then Force debug messages on.
You should see additional output in the Visual Studio Output window, a different debugger of your choice, or via DebugView.
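Since the failure only shows up on a user's machine, it may also help to capture some context on the Python side when Run fails, alongside the debug layer output. The following is only a sketch under assumptions: `run_with_diagnostics` is a hypothetical helper, and it relies on nothing beyond the session's `run` and `get_providers` methods.

```python
def run_with_diagnostics(session, output_names, input_feed, log=print):
    """Call session.run; on failure, log some context and re-raise (hypothetical helper)."""
    try:
        return session.run(output_names, input_feed)
    except Exception as exc:
        # Which execution providers the session actually ended up with.
        log(f"providers: {session.get_providers()}")
        # Shape and dtype of each input, when the value exposes them (e.g. NumPy arrays).
        for name, value in input_feed.items():
            shape = getattr(value, "shape", None)
            dtype = getattr(value, "dtype", None)
            log(f"input {name!r}: shape={shape} dtype={dtype}")
        log(f"error: {exc}")
        raise
```

Passing a file-backed logger as `log` would let the affected user send back the collected lines together with the debug layer messages.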

Rikyf3 commented 6 months ago

I have noticed that all the hardware involved, even though it supports DirectX 12, has a maximum feature level of 11_1. Does DirectML execution require feature level 12_0? I cannot find any reference to this in the documentation.

fdwr commented 5 months ago

Does DirectML execution require (D3D?) feature level 12_0?

@Rikyf3 The DML EP itself creates a DML device using D3D_FEATURE_LEVEL_11_0, if that answers your question. See https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/dml/dml_provider_factory.cc#L495.

I believe DMLCreateDevice works with D3D devices created with any of the following:

        D3D_FEATURE_LEVEL_1_0_GENERIC,
        D3D_FEATURE_LEVEL_1_0_CORE,
        D3D_FEATURE_LEVEL_11_0,
        D3D_FEATURE_LEVEL_11_1,
        D3D_FEATURE_LEVEL_12_0,
        D3D_FEATURE_LEVEL_12_1
