microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Fails to load this unet.onnx from Stable Diffusion (DirectML problem?) #14268

Closed: elephantpanda closed this issue 1 year ago

elephantpanda commented 1 year ago

Describe the issue

I tried to create an inference session for unet.onnx from here, and I get an error (the other ONNX files from this link work fine).

When trying to load it with DirectML I get the error "The parameter is incorrect.":

Microsoft.ML.OnnxRuntime.OnnxRuntimeException: [ErrorCode:RuntimeException] Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\GraphKernelHelper.cpp(78)\onnxruntime.dll!00007FFA23FD6CA3: (caller: 00007FFA23FD7445) Exception(3) tid(46b0) 80070057 The parameter is incorrect.

  at Microsoft.ML.OnnxRuntime.NativeApiStatus.VerifySuccess (System.IntPtr nativeStatus) [0x0002c] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession.Init (System.String modelPath, Microsoft.ML.OnnxRuntime.SessionOptions options, Microsoft.ML.OnnxRuntime.PrePackedWeightsContainer prepackedWeightsContainer) [0x0002f] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession..ctor (System.String modelPath, Microsoft.ML.OnnxRuntime.SessionOptions options) [0x0002c] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at ONNXRuntimeTest.CreateNamedSession (System.String path) [0x0004a] in C:\Users\Shadow\Dropbox\My Stable Diffusion\Assets\ONNXRuntimeTest.cs:84 
UnityEngine.Debug:Log (object)
ONNXRuntimeTest:CreateNamedSession (string) (at Assets/ONNXRuntimeTest.cs:91)
ONNXRuntimeTest:Unet () (at Assets/ONNXRuntimeTest.cs:124)
ONNXRuntimeTest:GO () (at Assets/ONNXRuntimeTest.cs:100)
UnityEngine.EventSystems.EventSystem:Update () (at Library/PackageCache/com.unity.ugui@1.0.0/Runtime/EventSystem/EventSystem.cs:377)
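
For reference, this is roughly what my CreateNamedSession does when DirectML is enabled (a simplified sketch, not the exact code; the logging and helper details are illustrative):

```csharp
using Microsoft.ML.OnnxRuntime;

InferenceSession CreateNamedSession(string path)
{
    // Enable the DirectML execution provider on adapter 0.
    var options = new SessionOptions();
    options.AppendExecutionProvider_DML(0);

    // The "The parameter is incorrect" exception above is thrown from this constructor.
    return new InferenceSession(path, options);
}
```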

The session loads in CPU mode, but when I run inference I get:

Microsoft.ML.OnnxRuntime.OnnxRuntimeException: [ErrorCode:RuntimeException] Non-zero status code returned while running Mul node. Name:'Mul_187' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/math/element_wise_ops.h:503 onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 2 by 160

  at Microsoft.ML.OnnxRuntime.NativeApiStatus.VerifySuccess (System.IntPtr nativeStatus) [0x0002c] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession.RunImpl (Microsoft.ML.OnnxRuntime.RunOptions options, System.IntPtr[] inputNames, System.IntPtr[] inputValues, System.IntPtr[] outputNames, Microsoft.ML.OnnxRuntime.DisposableList`1[T] cleanupList) [0x0004c] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession.Run (System.Collections.Generic.IReadOnlyCollection`1[T] inputs, System.Collections.Generic.IReadOnlyCollection`1[T] outputNames, Microsoft.ML.OnnxRuntime.RunOptions options) [0x00061] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession.Run (System.Collections.Generic.IReadOnlyCollection`1[T] inputs, System.Collections.Generic.IReadOnlyCollection`1[T] outputNames) [0x00001] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at Microsoft.ML.OnnxRuntime.InferenceSession.Run (System.Collections.Generic.IReadOnlyCollection`1[T] inputs) [0x00025] in <1e02e1e264404d07be919d65b1f5ddc7>:0 
  at ONNXRuntimeTest.RunIt2 (System.Collections.Generic.List`1[T] inputs) [0x0000a] in C:\Users\Shadow\Dropbox\My Stable Diffusion\Assets\ONNXRuntimeTest.cs:476 
UnityEngine.Debug:Log (object)
ONNXRuntimeTest:RunIt2 (System.Collections.Generic.List`1<Microsoft.ML.OnnxRuntime.NamedOnnxValue>) (at Assets/ONNXRuntimeTest.cs:480)
ONNXRuntimeTest:Unet () (at Assets/ONNXRuntimeTest.cs:140)
ONNXRuntimeTest:GO () (at Assets/ONNXRuntimeTest.cs:105)
UnityEngine.EventSystems.EventSystem:Update () (at Library/PackageCache/com.unity.ugui@1.0.0/Runtime/EventSystem/EventSystem.cs:377)
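
For context, this is roughly how I build the inputs and run the session on CPU (a simplified sketch; the input names and shapes here — "sample", "timestep", "encoder_hidden_states", batch 1, 64x64 latents — are what I believe a Stable Diffusion UNet export expects, so they may not match what this particular unet.onnx declares):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

Tensor<float> RunUnetOnce(InferenceSession session)
{
    // Assumed shapes: batch 1, 4x64x64 latents, 77-token CLIP embedding.
    // (Some exports expect the timestep as int64 rather than float.)
    var sample       = new DenseTensor<float>(new[] { 1, 4, 64, 64 });
    var timestep     = new DenseTensor<float>(new float[] { 981f }, new[] { 1 });
    var hiddenStates = new DenseTensor<float>(new[] { 1, 77, 768 });

    var inputs = new List<NamedOnnxValue>
    {
        NamedOnnxValue.CreateFromTensor("sample", sample),
        NamedOnnxValue.CreateFromTensor("timestep", timestep),
        NamedOnnxValue.CreateFromTensor("encoder_hidden_states", hiddenStates),
    };

    // The Mul_187 broadcast error in the trace above is thrown from Run on CPU.
    using (var results = session.Run(inputs))
    {
        // Clone so the data survives disposal of the native-backed results.
        return results.First().AsTensor<float>().Clone();
    }
}
```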

My NuGet packages are: DirectML 1.10.0, OnnxRuntime.DirectML 1.13.1, OnnxRuntime.Managed 1.13.1.

I am targeting netstandard2.0.

I am running it on a Shadow PC (12 GB RAM) with an Nvidia Quadro P5000, and I am using the Unity 3D engine.

From looking at other similar problems, it seems like it may be a problem with the onnxruntime.dll for DirectML; that is my guess.

I'm assuming the ONNX file itself is fine, because it has been around for four months.

To reproduce

Using the ONNX Runtime managed API for C#, try to create an InferenceSession for unet.onnx.

By the way, I am using Unity, if that makes a difference. It shouldn't.

Urgency

moderate.

Platform

Windows

OS Version

Windows 10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.13.1

ONNX Runtime API

C#

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

1.10.0

yuslepukhin commented 1 year ago

Please, take a look here first.

elephantpanda commented 1 year ago

> Please, take a look here first.

Hi, thanks. No, I don't think this is the problem. It is definitely finding the latest DirectML.dll (version 1.10.0), because before I put that DLL in the folder none of the other ONNX files would load either.

All the other ONNX files work fine with DirectML; it is just this specific unet.onnx that fails.

It seems more similar to this problem and this problem, although neither is quite the same.

Is this more likely to be a problem with onnxruntime.dll or with the ONNX file itself? Could it have been exported incorrectly, with the wrong dimensions for the input tensors? (A quick way to check is sketched below.) Although people claimed to have built it using OpenVINO.
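
One way to check what dimensions the file actually declares is to dump the session's input metadata once it loads in CPU mode (quick sketch using the managed API; Debug.Log is just because I'm in Unity):

```csharp
using Microsoft.ML.OnnxRuntime;
using UnityEngine;

void DumpInputShapes(InferenceSession session)
{
    foreach (var input in session.InputMetadata)
    {
        var meta = input.Value;
        // A dimension of -1 indicates a dynamic/symbolic axis.
        Debug.Log($"{input.Key}: {meta.ElementType} [{string.Join(", ", meta.Dimensions)}]");
    }
}
```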

Edit: I have converted my own Stable Diffusion ONNX files from the Hugging Face diffusers files, and that seems to work.

So it could mean that those other ONNX files just don't work. I don't know.

fdwr commented 1 year ago

Paul, I know Pat's been working on a number of improvements/issues lately. Does Stable Diffusion 1.5 with ort-nightly-directml 1.15.0.dev20230301008 work for you? I ran this version https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/unet locally (NVidia GeForce RTX 3070 Ti) after converting weights to float16, and used the diffusers pipeline mentioned here https://www.travelneil.com/stable-diffusion-windows-amd.html.

[image attached]

elephantpanda commented 1 year ago

> Paul, I know Pat's been working on a number of improvements/issues lately. Does Stable Diffusion 1.5 with ort-nightly-directml 1.15.0.dev20230301008 work for you? I ran this version https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/unet locally (NVidia GeForce RTX 3070 Ti) after converting weights to float16, and used the diffusers pipeline mentioned here https://www.travelneil.com/stable-diffusion-windows-amd.html.
>
> [image attached]

Hi, I got it working by converting the files to ONNX using the Python script instead of using the ONNX files on Hugging Face.

Then I converted the Python code to C# to run it.