[WebGPU] `Error: [WebGPU] Kernel "[MaxPool] /sincnet/pool1d.0/MaxPool" failed. Error: length of specified kernel shapes should be 2 less than length of input dimensions`

xenova commented 1 month ago

Describe the issue

Attempting to run this PyAnnote segmentation model with WebGPU produces the following error:

An error occurred during model execution: "Error: [WebGPU] Kernel "[MaxPool] /sincnet/pool1d.0/MaxPool" failed. Error: length of specified kernel shapes should be 2 less than length of input dimensions".

Running with WASM produces the correct result.

To reproduce

Minimal reproduction: https://jsfiddle.net/omgkb3n1/1/

Urgency

Prevents use of this model in Transformers.js

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.18.0

Execution Provider

'webgpu' (WebGPU)

xenova commented 1 month ago

cc @guschmue for viz

gyagp commented 1 month ago

The root cause is that ORT WebGPU only implements maxpool2d. This model requires maxpool1d, so it fails the check "kernelShape.length !== inputDims.length - 2" in adjustPoolAttributes().
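To make the rank rule in the error message concrete, here is a minimal sketch of the check and of one possible way to express a 1D pool as a 2D pool by appending a trailing axis of size 1. The names `PoolAttrs`, `kernelRankMatches`, and `liftPool1dTo2d` are hypothetical and are not taken from the ORT source, and this is not necessarily how the actual fix works.

```typescript
// Illustrative sketch only: these helpers are hypothetical, not ORT source code.

/** Pooling attributes relevant to the rank check. */
interface PoolAttrs {
  kernelShape: number[];
  strides: number[];
  pads: number[]; // ONNX layout: all begin values first, then all end values
}

/** The rule from the error message: one kernel entry per spatial axis (rank minus N and C). */
function kernelRankMatches(kernelShape: number[], inputDims: number[]): boolean {
  return kernelShape.length === inputDims.length - 2;
}

/** Rewrites a 1D pool over [N, C, L] as an equivalent 2D pool over [N, C, L, 1]. */
function liftPool1dTo2d(
  inputDims: number[],
  attrs: PoolAttrs,
): { inputDims: number[]; attrs: PoolAttrs } {
  if (inputDims.length !== 3 || attrs.kernelShape.length !== 1) {
    throw new Error('expected a 1D pool: input [N, C, L] and a one-element kernel shape');
  }
  // Assumed defaults when an attribute is omitted: stride 1, no padding.
  const stride = attrs.strides[0] ?? 1;
  const padBegin = attrs.pads[0] ?? 0;
  const padEnd = attrs.pads[1] ?? 0;
  return {
    inputDims: [...inputDims, 1],
    attrs: {
      kernelShape: [attrs.kernelShape[0], 1], // the added axis is never pooled
      strides: [stride, 1],
      pads: [padBegin, 0, padEnd, 0], // [L_begin, W_begin, L_end, W_end]
    },
  };
}
```

With this approach the output of the 2D pool would have shape [N, C, L_out, 1], so the trailing axis has to be dropped again before handing the result to the next node.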

guschmue commented 1 month ago

I think there are more issues with pyannote after the first maxpool. The Conv1D behind that maxpool is messing up its output shape, as far as I can tell. But worse with pyannote: there are LSTMs behind the first 2 layers, and I don't think we have plans to support LSTM in WebGPU.

guschmue commented 1 month ago

And the fix for maxpool looks good.
