microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Web] WebGPU backend fails to load some models due to an exception during initialization inside the transpose optimizer #15869

Closed: gegogi closed this issue 5 months ago

gegogi commented 1 year ago

Describe the issue

I am trying to load a model with the WebGPU backend. I can load the model downloaded from https://github.com/onnx/models/blob/main/vision/classification/mobilenet/model/mobilenetv2-12.onnx, but I couldn't load the following model: https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder. Both models can be loaded with the Python onnxruntime package.

To reproduce

Download the model from: https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx/vae_encoder and run the following code:

const ort = require('onnxruntime-web/webgpu');

async function main() {
    // Session creation throws for this model when the webgpu execution provider is selected.
    const modelPath = './models/sd15_vae_encoder_model.onnx';
    const session = await ort.InferenceSession.create(modelPath, { executionProviders: ['webgpu'] });
}

main().catch(console.error);

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-web@1.16.0-dev.20230508-045c623415

Execution Provider

Other / Unknown

gegogi commented 1 year ago

FYI, loading still fails even after conversion to .ort format.
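
For context, converting an ONNX model to the ORT format is normally done with the converter shipped in the onnxruntime Python package; a minimal sketch, assuming the .onnx file sits in ./models/vae_encoder (the path is a placeholder, and this is not necessarily the exact command used here):

    python -m onnxruntime.tools.convert_onnx_models_to_ort ./models/vae_encoder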

fs-eire commented 1 year ago

I will take a look

visheratin commented 1 year ago

The most likely reason is that the VAE encoder graph has operators that are not yet supported by the WebGPU execution provider, e.g., InstanceNormalization, Slice, Reshape.

fs-eire commented 1 year ago

The operator coverage is a problem, but that should not cause the model loading failure. After debugging the issue, I found that the problem is in the transpose optimizer.

 C:\a\_work\1\s\onnxruntime\core\optimizer\transpose_optimizer\optimizer_api_impl.cc:280 virtual std::vector<uint8_t> onnxruntime::ApiTensor::Data() const [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : 
Yi @ ort.webgpu.min.js:6

I need to dig deeper into the source code. I am debugging it.

fs-eire commented 1 year ago

@gegogi This issue should have been fixed by the PR mentioned above. Please help validate whether it works. Thanks.

gegogi commented 1 year ago

Could you publish the latest nightly npm build? I tried to build onnxruntime myself but couldn't figure out the compilation errors related to a protobuf version mismatch. It seems the project has protobuf as a submodule but is trying to include headers from a system directory, which have different signatures.

[screenshot of the protobuf-related compilation errors]

fs-eire commented 1 year ago

Please try onnxruntime-web@1.16.0-dev.20230521-0204594f90
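
For anyone following along, trying that build just means pinning the dev version named above when installing (a sketch of the install step, nothing else assumed):

    npm install onnxruntime-web@1.16.0-dev.20230521-0204594f90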

gabrielgrant commented 5 months ago

This appears to be fixed for me when running this example: https://gist.github.com/gabrielgrant/cb3e072dec5a416b4fc24f18ae902fb7

...but despite using ort.webgpu.min.js and only specifying executionProviders: ['webgpu'], it still demands that ort.env.wasm.wasmPaths be set, so it is not entirely clear to me whether it is actually using the WebGPU backend rather than WASM. (Is the WASM bundle just needed as a fallback for kernels not yet implemented in WebGPU?)
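
One rough way to sanity-check the environment is sketched below; it assumes the page runs in a browser, and it only confirms that WebGPU is exposed at all rather than proving which execution provider ONNX Runtime ends up using (navigator.gpu and ort.env.logLevel are standard APIs, the rest is illustrative):

    const ort = require('onnxruntime-web/webgpu');

    // Check whether the browser exposes WebGPU at all; without it the webgpu EP cannot be used.
    if (!('gpu' in navigator)) {
        console.warn('WebGPU is not available in this browser.');
    }

    // Verbose logging prints more detail about session initialization to the console.
    ort.env.logLevel = 'verbose';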

@gegogi are you able to confirm this is fixed? (this should be in a release now)

@fs-eire:

  1. Can you confirm that the gist example I've put together is testing the issue correctly?
  2. Are you confident enough that #15988 fixes this to close this issue?

fs-eire commented 5 months ago

ONNX Runtime Web depends on C++ code for session, graph, and model execution, and that code is compiled into WebAssembly. In short, ONNX Runtime Web always needs to load WebAssembly, no matter whether you use the webgpu or the wasm (cpu) EP.

However, you don't always have to set ort.env.wasm.wasmPaths. If it is not set, ONNX Runtime Web will try to load the .wasm files from the "current folder" (relative to the URL of the JavaScript file that is currently running). The flag just offers a way to customize that path.
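
As a concrete illustration of that default and of overriding it, here is a minimal sketch; the /static/ort-wasm/ prefix is a placeholder for wherever the ort-wasm*.wasm files are actually hosted:

    const ort = require('onnxruntime-web/webgpu');

    // If wasmPaths is left unset, the .wasm files are fetched relative to the running script.
    // Setting it overrides where ONNX Runtime Web downloads its WebAssembly binaries from.
    ort.env.wasm.wasmPaths = '/static/ort-wasm/';

    async function main() {
        const session = await ort.InferenceSession.create('./models/sd15_vae_encoder_model.onnx', {
            executionProviders: ['webgpu'],
        });
    }

    main().catch(console.error);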

The issue related to "Transpose" is already fixed, so let me close this issue.