Open xenova opened 1 month ago
cc @guschmue for viz
The root cause is that ORT WebGPU only implements maxpool2d. This model requires maxpool1d, so it trips the check "kernelShape.length !== inputDims.length - 2" in adjustPoolAttributes().
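To make the failure mode concrete, here is a minimal sketch of a rank check of the kind quoted above. It is not the ORT source: `checkPoolRank` is a hypothetical helper, the tensor sizes are made up, and the idea that the 1D case reaches the check with a 2-element kernel shape is only an assumption about how a 2D-only code path could produce the mismatch.

```javascript
// Hypothetical stand-in for the quoted rank check; not the actual ORT code.
function checkPoolRank(inputDims, kernelShape) {
  // Assumed input layout: [batch, channels, ...spatial].
  const spatialRank = inputDims.length - 2;
  if (kernelShape.length !== spatialRank) {
    throw new Error(
      `kernelShape has ${kernelShape.length} dims, expected ${spatialRank}`
    );
  }
  return spatialRank;
}

// 2D pooling: 4D input with a 2-element kernel shape passes the check.
checkPoolRank([1, 60, 32, 32], [3, 3]);

// 1D pooling: 3D input. If a 2D-only path supplies a 2-element kernel
// shape (assumption), the ranks disagree and the check throws.
try {
  checkPoolRank([1, 60, 1024], [3, 1]);
} catch (e) {
  console.log(e.message);
}
```

Under this reading, a fix needs the kernel shape, strides, and pads to carry one entry per spatial dimension of the input rather than always two.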
I think there are more issues with pyannote after the first maxpool: the Conv1D behind that maxpool is messing up its output shape, as far as I can tell. But worse with pyannote: there are LSTMs behind the first 2 layers, and I don't think we have plans to support LSTM in WebGPU.
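For anyone checking the Conv1D/maxpool shapes by hand, the expected output length along the single spatial axis can be computed with the standard formula. This is a sketch under stated assumptions: explicit symmetric padding, no auto_pad, and floor rounding (ceil_mode off). `outputLength1d` is a hypothetical helper and the numbers are illustrative, not taken from the model.

```javascript
// Expected output length for a 1D Conv or MaxPool along its spatial axis.
// Assumes symmetric explicit padding and floor rounding.
function outputLength1d(length, kernel, stride = 1, pad = 0, dilation = 1) {
  const effectiveKernel = dilation * (kernel - 1) + 1;
  return Math.floor((length + 2 * pad - effectiveKernel) / stride) + 1;
}

console.log(outputLength1d(10, 3));       // 8
console.log(outputLength1d(10, 3, 3));    // 3
console.log(outputLength1d(10, 3, 1, 1)); // 10
```

Comparing these values against the shapes reported by the WASM and WebGPU runs would show which node first diverges.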
And the fix for maxpool looks good.
Describe the issue
Attempting to run this PyAnnote segmentation model with WebGPU produces the following error:
Running with WASM produces the correct result.
To reproduce
Minimal reproduction: https://jsfiddle.net/omgkb3n1/1/
Urgency
Prevents use of this model in Transformers.js
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
Execution Provider
'webgpu' (WebGPU)