microsoft / Windows-Machine-Learning

Samples and Tools for Windows ML.
https://docs.microsoft.com/en-us/windows/ai/
MIT License

Crash at DmlCommandRecorder::ValidateRecordDispatch #523

Open venki-thiyag opened 1 year ago

venki-thiyag commented 1 year ago

A crash was observed during WinML processing on a Windows system [Windows 10.0.19045].

The call stack points to the following:

    DirectML.dll!DmlCommandRecorder::ValidateRecordDispatch(struct ID3D12CommandList *,struct IDMLDispatchable *,struct IDMLBindingTable *) Unknown Non-user code. Symbols loaded.
>   DirectML.dll!DmlCommandRecorder::RecordDispatch(struct ID3D12CommandList *,struct IDMLDispatchable *,struct IDMLBindingTable *) Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!Dml::DmlCommandRecorder::ExecuteOperator(struct IDMLCompiledOperator *,struct DML_BINDING_DESC const &,class gsl::span<struct DML_BINDING_DESC const ,-1>,class gsl::span<struct DML_BINDING_DESC const ,-1>)    Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!Dml::ExecutionProviderImpl::ExecuteOperator(struct IDMLCompiledOperator *,struct DmlBindingDesc const &,class gsl::span<struct DML_BINDING_DESC,-1>,class gsl::span<struct DML_BINDING_DESC,-1>) Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!Dml::ExecutionProviderImpl::ExecuteOperator(struct IDMLCompiledOperator *,struct DmlBindingDesc const &,class gsl::span<struct IMLOperatorTensor *,-1>,class gsl::span<struct IMLOperatorTensor *,-1>)   Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!Dml::DmlOperator::Compute(class MLOperatorKernelContext const &) Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!MLOperatorKernel<class Dml::DmlOperatorActivationTemplate<35> >::Compute(struct IMLOperatorKernelContext *)  Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::Windows::AI::MachineLearning::implementation::AbiOpKernel::Compute(class onnxruntime::OpKernelContext *)  Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!onnxruntime::SequentialExecutor::Execute(class onnxruntime::SessionState const &,class std::vector<int,class std::allocator<int> > const &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > const &,class std::vector<int,class std::allocator<int> > const &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > &,class std::unordered_map<unsigned __int64,class std::function<class onnxruntime::common::Status >,struct std::hash<unsigned __int64>,struct std::equal_to<unsigned __int64>,class std::allocator<struct std::pair<unsigned __int64 const ,class std::function<class onnxruntime::common::Status > > > > const &,class onnxruntime::logging::Logger const &)   Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!onnxruntime::utils::ExecuteGraph(class onnxruntime::SessionState const &,class onnxruntime::FeedsFetchesManager &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > const &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > &,class std::unordered_map<unsigned __int64,class std::function<class onnxruntime::common::Status >,struct std::hash<unsigned __int64>,struct std::equal_to<unsigned __int64>,class std::allocator<struct std::pair<unsigned __int64 const ,class std::function<class onnxruntime::common::Status > > > > const &,bool,bool const &,class onnxruntime::logging::Logger const &,bool)   Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!onnxruntime::InferenceSession::Run(struct OrtRunOptions const &,class std::vector<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,class std::allocator<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > const &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > const &,class std::vector<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,class std::allocator<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > const &,class std::vector<struct OrtValue,class std::allocator<struct OrtValue> > *) Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::Run(struct winrt::com_ptr<struct winrt::Windows::AI::MachineLearning::implementation::LearningModelBinding>)  Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::EvaluateAsync$_ResumeCoro$2() Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::EvaluateAsync$_InitCoro$1()   Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::EvaluateAsync(struct winrt::Windows::AI::MachineLearning::LearningModelBinding,struct winrt::hstring) Unknown Non-user code. Symbols loaded.
    Windows.AI.MachineLearning.dll!winrt::impl::produce<struct winrt::Windows::AI::MachineLearning::implementation::LearningModelSession,struct winrt::Windows::AI::MachineLearning::ILearningModelSession>::EvaluateAsync(void *,void *,void * *)  Unknown Non-user code. Symbols loaded.
    [Frames may be missing, no binary loaded for RCVNativeVBG.dll]      Annotated Frame
    RCVNativeVBG.dll!00007ffe36f1c559() Unknown Non-user code. No matching binary found.

Environment

Windows Build Number: Windows 10.0.19045

WinML version: DirectML.dll 1.0.200713-1013.1.vb.07142e1

Crash dump is attached: winml_crash_dump.zip

nums11 commented 1 year ago

It looks like you're running Win10. Have you tried using the NuGet package? We've made a lot of improvements there that may address your issue.