Open · josephrocca opened this issue 1 year ago
opset.ts:47 Uncaught (in promise) TypeError: cannot resolve operator 'DynamicQuantizeLinear' with opsets: ai.onnx v13
    at t.resolveOperator (opset.ts:47:1)
    at t.WebGLSessionHandler.resolve (session-handler.ts:81:1)
    at t.Session.initializeOps (session.ts:242:1)
    at session.ts:93:1
    at t.Profiler.event (instrument.ts:337:1)
    at t.Session.initialize (session.ts:89:1)
    at session.ts:71:1
Facing the same issue as above.
@josephrocca It seems that none of the quantization ops in onnxruntime are supported by the WebGL backend. Did you solve this problem? I think we may have to implement our own operator in a low-level language or in TS/JS.
@SangbumChoi I didn't solve it - I'm wondering if it's possible to "dequantize" it on the client. The main reason I want the model quantized is to reduce the time it takes for the client to download it.
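For anyone exploring that route: the arithmetic itself is simple, since ONNX's DequantizeLinear is just x = (q - zeroPoint) * scale. Here's a minimal sketch in TypeScript of what client-side dequantization of a weight tensor would compute (the function name, per-tensor layout, and types are illustrative, not taken from the actual model):

```ts
// Illustrative sketch: linear dequantization of an int8 tensor, per the
// ONNX (De)QuantizeLinear definition: x = (q - zeroPoint) * scale.
// Assumes a single per-tensor scale/zeroPoint; real models may quantize per-axis.
function dequantize(
  quantized: Int8Array, // quantized values (hypothetical input)
  scale: number,        // per-tensor scale
  zeroPoint: number     // per-tensor zero point
): Float32Array {
  const out = new Float32Array(quantized.length);
  for (let i = 0; i < quantized.length; i++) {
    out[i] = (quantized[i] - zeroPoint) * scale;
  }
  return out;
}
```

The hard part isn't this loop, of course, but rewriting the model graph on the client so the quantized ops are replaced with float ones.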
But the problem is that the WebGL backend is just missing a lot of ops compared to the Wasm backend. The unquantized version of my model also has Erf - just like yours, apparently - which the WebGL backend doesn't support.
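If someone did want to polyfill a missing op like Erf in TS/JS, there's at least a well-known closed-form approximation (Abramowitz & Stegun formula 7.1.26, max error around 1.5e-7). A sketch of what such a kernel would have to compute - this is not ONNX Runtime's actual implementation:

```ts
// Illustrative only: Abramowitz & Stegun 7.1.26 approximation of erf(x).
// erf(x) ≈ 1 - (a1*t + a2*t² + a3*t³ + a4*t⁴ + a5*t⁵) * exp(-x²), t = 1/(1 + p*x)
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  // Horner evaluation of the degree-5 polynomial in t
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}
```

That only covers the scalar math, though; wiring it into the WebGL backend as a shader kernel is the part that actually takes work.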
Like I said in my first comment on this issue, I hope that WebGPU will solve these compatibility problems by just compiling the native GPU code to WGSL, so we get ~full op support without putting a lot of burden on the ONNX Runtime Web team.
Describe the issue
When using the WebGL backend with this model, I get the error quoted at the top of this issue.
Note that I had to use opset 16 because PyTorch ONNX export doesn't support opset 17. Note also that the Wasm backend works fine, as usual. I'm not sure how committed the team is to improving WebGL op support, but I'll just note that it's currently pretty rare for me to get the WebGL backend working at all, due to lack of op support.
Perhaps the team is waiting for WebGPU to land in browsers (probably early next year?) before putting more effort into GPU inference on the web? I'm hoping that with WebGPU, the ONNX Runtime team will be able to "automatically" port their native GPU kernels to WGSL, just like they port the native CPU kernels to Wasm. IIUC, WGSL is specced with ease of porting/transpilation from native GPU shader formats in mind? If a manual rewrite of all the WebGPU kernels is required, then I'm worried that the WebGPU backend will forever have patchy op support compared to the Wasm backend.
To reproduce
https://jsbin.com/daginihoho/edit?html,output
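In case the jsbin link rots: the reproduction is roughly the following (assuming onnxruntime-web 1.13.1 and a placeholder model URL; the exact jsbin contents may differ):

```ts
// Hypothetical minimal repro. Assumes onnxruntime-web 1.13.1 is loaded, e.g. via
// <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.13.1/dist/ort.min.js"></script>,
// which exposes a global `ort`.
const MODEL_URL = './model-quantized.onnx'; // placeholder, not the actual model

async function main(): Promise<void> {
  try {
    // Requesting only 'webgl' triggers the unsupported-operator error above.
    const session = await ort.InferenceSession.create(MODEL_URL, {
      executionProviders: ['webgl'],
    });
    console.log('session created; inputs:', session.inputNames);
  } catch (e) {
    // TypeError: cannot resolve operator 'DynamicQuantizeLinear' ...
    console.error(e);
  }
}

main();
```

Listing multiple execution providers (e.g. `executionProviders: ['webgl', 'wasm']`) should make the runtime fall back in order, which works around the error at the cost of losing GPU acceleration.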
Urgency
No hard deadlines.
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.13.1
Execution Provider
WebGL