microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Web] Having trouble loading a model and creating a session #14583

Closed nezaBacar closed 1 year ago

nezaBacar commented 1 year ago

Describe the issue

Hello, I am trying to load a UNet model using onnxruntime-web in a React.js environment with the following code:

    const ort = require('onnxruntime-web');
    const unet = "./model/unet/model.onnx";
    const unet_session = ort.InferenceSession.create(unet);

When loading this model I get the following error:

    wasm-core-impl.ts:48 Uncaught (in promise) Error: Can't create a session
        at t.createSession (wasm-core-impl.ts:48:1)
        at t.createSession (proxy-wrapper.ts:141:1)
        at t.OnnxruntimeWebAssemblySessionHandler.loadModel (session-handler.ts:36:1)
        at Object.createSessionHandler (backend-wasm.ts:66:1)
        at async InferenceSession.create (inference-session-impl.ts:189:1)

Using this same method and code I've been able to load other models without errors and warnings. Is there anything I can do to investigate why I'm running into this issue?
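
One way to get a bit more context out of a failure like this is to turn up the runtime's logging and await the create call so the rejection can be inspected. This is only a minimal sketch; the logLevel/debug settings are onnxruntime-web env options and may or may not add detail for this particular error:

    const ort = require('onnxruntime-web');

    // Ask the runtime for more logging before creating the session.
    ort.env.logLevel = 'verbose';
    ort.env.debug = true;

    async function loadUnet() {
      try {
        return await ort.InferenceSession.create('./model/unet/model.onnx');
      } catch (e) {
        // The rejection reason sometimes carries more detail than the console output.
        console.error('Failed to create session:', e);
        throw e;
      }
    }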

To reproduce

Load onnx models from this link: https://drive.google.com/drive/folders/1MM-b-5sUUNlQVhIqSnq4BhEwRaU1rMFV?usp=share_link The one that does not work is in the unet folder.

This is a tiny Stable Diffusion model from https://huggingface.co/hf-internal-testing/tiny-stable-diffusion-torch, transformed to ONNX with a Hugging Face script. I am using it to set up a Stable Diffusion web inference pipeline. Since I don't have problems loading other models, I also tried some other Stable Diffusion models I found online, for example https://huggingface.co/ShadowPower/waifu-diffusion-v1-3-onnx. I had the same issue there (only with the unet.onnx), so I assume the problem is specific to Stable Diffusion models.

Urgency

It's somewhat urgent.

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.13.1 (onnxruntime-web)

Execution Provider

WASM

shalvamist commented 1 year ago

Hi nezaBacar,

Thanks for the detailed report of the issue. I was able to reproduce the problem on my end. A quick possible workaround is to convert the model to the ORT format.

This can be done with the following command line (more details can be found in the ONNX Runtime documentation on ORT format models):

    python -m onnxruntime.tools.convert_onnx_models_to_ort model.onnx --optimization_style Fixed

I tested it on my end and can create a session with the ort model in the browser. Hope this will unblock you for now. I will keep investigating the issue and update you with the results.
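
As a sketch of the follow-up step, loading the converted file looks the same as before, just pointing at the .ort output (the path is an assumption; the conversion tool writes the .ort file next to the original model by default):

    const ort = require('onnxruntime-web');

    async function loadUnet() {
      // Assumes the converted file sits next to the original model.onnx.
      return await ort.InferenceSession.create('./model/unet/model.ort');
    }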

guschmue commented 1 year ago

btw, the issue is that onnxruntime-web currently does not support the ONNX external data format (i.e. the graph is in the .onnx file and the weights are in separate files). Converting it to a .ort file just puts the model and weights into the same file.

nezaBacar commented 1 year ago

Okay, I understand. I converted it to ORT format and it works. Thank you both for the fast response.

guschmue commented 1 year ago

interesting model - we'd love to see this work in the browser. Let us know if you run into more issues.

I expect that even with this smaller model you will run into perf issues with wasm. You can squeeze a little more out of wasm by enabling multiple threads with:

    --enable-features=SharedArrayBuffer in the chrome command line

or by sending the following headers from the server:

     self.send_header("Cross-Origin-Opener-Policy", "same-origin")
     self.send_header("Cross-Origin-Embedder-Policy", "require-corp")

In your source you can then request the number of threads you want:

    ort.env.wasm.numThreads = 8;

nezaBacar commented 1 year ago

Thank you for the response. I was able to resolve the problem. This is actually a smaller version of the Stable Diffusion model; I am trying to recreate the inference on the browser side first and then test it with the full model. I expect to have some performance issues, I'm just trying to test the limits. Thank you so much for the additional tips.

nezaBacar commented 1 year ago

btw, the issue is that onnxruntime-web currently does not support the ONNX external data format (i.e. the graph is in the .onnx file and the weights are in separate files). Converting it to a .ort file just puts the model and weights into the same file.

Hi, I am asking for some help again. Is there any other way to combine the weights and the graph into the same file? I am trying to convert the original Stable Diffusion v1.4 model, but there is a protobuf/FlatBuffer limit of 2 GB and the weights are 3 GB+.

Sergyoubi commented 8 months ago

I had the same issue in my React app, but I've fixed it! I use Vite as front-end tooling. Here is what I've done:

    // vite.config.js — copy the onnxruntime-web .wasm binaries into the build
    // output so they can be fetched at runtime.
    import { defineConfig } from "vite";
    import react from "@vitejs/plugin-react";
    import { viteStaticCopy } from "vite-plugin-static-copy";

    export default defineConfig({
      plugins: [
        react(),
        viteStaticCopy({
          targets: [
            {
              src: "node_modules/onnxruntime-web/dist/*.wasm",
              dest: ".",
            },
          ],
        }),
      ],
    });

With that in place, InferenceSession.create("/model/my_model.onnx") works as expected.
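
If the .wasm files end up somewhere other than the site root, a possible alternative (a sketch; the /onnx-wasm/ folder name here is an assumption) is to point onnxruntime-web at them explicitly before creating the session:

    import * as ort from "onnxruntime-web";

    // Assumption: the ort-wasm*.wasm files are hosted under /onnx-wasm/ rather
    // than the site root; wasmPaths tells the runtime where to fetch them from.
    ort.env.wasm.wasmPaths = "/onnx-wasm/";

    const session = await ort.InferenceSession.create("/model/my_model.onnx");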