microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Read access violation under OnnxRuntimeCpuSessionBuilder::Initialize during WinML operator tests for function operators #14810

Open maggie1059 opened 1 year ago

maggie1059 commented 1 year ago

Describe the issue

This affects a customer model that we don't have access to, but the WinML ONNX conformance tests hit the same stack. After updating our submodule to more recent ORT commits, we're seeing read access violations for several operators, and the pattern appears across operators that have function decompositions. I'm getting the following output from the HardSwish test on CPU:

StartGroup: HardSwish-14_float16.WinML_Cpu
TAEF: A crash with exception code 0xC0000005 occurred in module "onnxruntime.dll" in process "te.processhost.exe" (pid:32068).
Error: TAEF: [HRESULT 0x800706BE] A failure occurred while running a test operation: 'OnnxConformanceTestsTaef::HardSwish'. (A crash with exception code 0xC0000005 occurred in module "onnxruntime.dll" in the process hosting the test code while invoking a test operation.)
Error: Wex.Logger Mismatched Group: A log grouping named "OnnxConformanceTestsTaef::HardSwish" was ended when the active group was "HardSwish-14_float16.WinML_Cpu".
EndGroup: Wex.Logger Mismatched Group: HardSwish-14_float16.WinML_Cpu [Failed]
EndGroup: OnnxConformanceTestsTaef::HardSwish [Failed]
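
For anyone reading the two codes in the log above, here is a small Python sketch (standard library only) that splits an HRESULT into its fields using the standard bit layout. The function names and the lookup tables are illustrative and not part of TAEF or ONNX Runtime; the tables cover only the values that appear in this log.

```python
# Illustrative helpers: names and tables are assumptions made for this sketch.
KNOWN_NTSTATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}
KNOWN_WIN32 = {1726: "RPC_S_CALL_FAILED"}
FACILITY_WIN32 = 7


def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into severity, facility, and code fields."""
    hr &= 0xFFFFFFFF
    return {
        "failed": bool(hr >> 31),        # bit 31: severity (1 = failure)
        "facility": (hr >> 16) & 0x7FF,  # bits 16-26: facility
        "code": hr & 0xFFFF,             # bits 0-15: code
    }


def describe_hresult(hr: int) -> str:
    """Format an HRESULT, naming the Win32 error when it is in the table."""
    fields = decode_hresult(hr)
    name = "unknown"
    if fields["facility"] == FACILITY_WIN32:
        name = KNOWN_WIN32.get(fields["code"], "unknown")
    return (f"0x{hr & 0xFFFFFFFF:08X}: facility={fields['facility']} "
            f"code={fields['code']} ({name})")


if __name__ == "__main__":
    print(describe_hresult(0x800706BE))
    print(KNOWN_NTSTATUS[0xC0000005])
```

Read this way, the HRESULT 0x800706BE wraps Win32 error 1726 (the remote procedure call failed), which is consistent with the test host process dying mid-call. The underlying fault is the 0xC0000005 access violation in onnxruntime.dll.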

with the following call stack (attached as an image):

I'm seeing this for the following tests:

To reproduce

This is the model for the HardSwish test described above: HardSwish.zip

Running onnxruntime_perf_test.exe -I -r 1 -e cpu HardSwish.onnx outputs the following stack trace:

Stacktrace:
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\onnx\onnx_model_info.cc(23): OnnxModelInfo::OnnxModelInfo
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\.pkg\msucrt.amd64.10.0.25301.1000-230213-1216.rs-es\Public\onecore\internal\sdk\inc\ucrt\stl120\memory(3433): std::make_unique<OnnxModelInfo,wchar_t const * &,0>
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\onnx\TestCase.cc(269): TestModelInfo::LoadOnnxModel
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\perftest\performance_runner.cc(250): onnxruntime::perftest::CreateModelInfo
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\perftest\performance_runner.cc(284): onnxruntime::perftest::PerformanceRunner::PerformanceRunner
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\perftest\main.cc(45): real_main
C:\Users\wumaggie.NORTHAMERICA\Desktop\ai\onnxruntime\onnxruntime\test\perftest\main.cc(64): wmain
VCCRT\vcstartup\src\startup\exe_common.inl(91): invoke_main
VCCRT\vcstartup\src\startup\exe_common.inl(288): __scrt_common_main_seh
VCCRT\vcstartup\src\startup\exe_common.inl(331): __scrt_common_main
VCCRT\vcstartup\src\startup\exe_wmain.cpp(17): wmainCRTStartup
???: BaseThreadInitThunk
???: RtlUserThreadStart
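
To check other operator models the same way, the perf-test invocation above could be wrapped in a short Python script. This is only a sketch under assumptions: the helper names are hypothetical, the executable is assumed to be on PATH or in the current directory, and the flags are copied verbatim from the command in this report. Whether the perf test exits with the access-violation code or prints a stack trace and exits with some other code is not established here, so the script simply reports whatever exit code it sees.

```python
import subprocess
import sys

ACCESS_VIOLATION = 0xC0000005  # exception code from the TAEF log above


def build_command(model_path, exe="onnxruntime_perf_test.exe"):
    """Return the command line used in this report for one model file."""
    return [exe, "-I", "-r", "1", "-e", "cpu", model_path]


def crashed_with_access_violation(returncode):
    """Compare as unsigned 32-bit, since exit codes may be reported signed."""
    return (returncode & 0xFFFFFFFF) == ACCESS_VIOLATION


def run_model(model_path):
    """Run the perf test once and return (exit code, captured stderr)."""
    result = subprocess.run(build_command(model_path),
                            capture_output=True, text=True)
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Usage (hypothetical script name): python repro.py HardSwish.onnx Other.onnx
    for path in sys.argv[1:]:
        code, _stderr = run_model(path)
        if crashed_with_access_violation(code):
            status = "ACCESS VIOLATION"
        else:
            status = f"exit code {code}"
        print(f"{path}: {status}")
```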

Urgency

No response

Platform

Windows

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.14.0 (commit #14456)

ONNX Runtime API

WinML

Architecture

X64

Execution Provider

CPU

Execution Provider Library Version

No response

fdwr commented 1 year ago

Removed the DML tag because this isn't DML-specific: it occurs with CPU-only execution.