Closed: John-Yao closed this issue 4 months ago
I use Multi-Query Attention (implemented with repeat_interleave) in a Conformer and export it to an ONNX model. The exported model fails to run with onnxruntime, and it seems to be truncated.
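For readers unfamiliar with the setup, here is a minimal pure-Python sketch of what repeat_interleave does to the key/value heads in Multi-Query Attention. It only illustrates the head mapping and is not the wenet implementation. The helper names `repeat_interleave` and `expand_kv_heads` and the string labels are hypothetical, and the list version is assumed to mirror how `torch.repeat_interleave` behaves along a single dimension.

```python
def repeat_interleave(items, repeats):
    """Repeat each element `repeats` times consecutively.

    Assumed to mirror torch.repeat_interleave along one dimension:
    [a, b] with repeats=2 becomes [a, a, b, b].
    """
    return [item for item in items for _ in range(repeats)]


def expand_kv_heads(kv_heads, n_query_heads):
    """Map a small set of K/V heads onto `n_query_heads` query heads."""
    if n_query_heads % len(kv_heads) != 0:
        raise ValueError("n_query_heads must be a multiple of the number of K/V heads")
    return repeat_interleave(kv_heads, n_query_heads // len(kv_heads))


# Multi-query attention: a single K/V head shared by all 4 query heads.
mqa = expand_kv_heads(["kv0"], 4)

# Grouped variant: 2 K/V heads, each shared by 2 consecutive query heads.
gqa = expand_kv_heads(["kv0", "kv1"], 4)

print(mqa)
print(gqa)
```

In the real model the same expansion is applied to the head dimension of the key and value tensors before the attention product, which is the operation that ends up in the exported ONNX graph.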
@John-Yao Please help verify https://github.com/wenet-e2e/wenet/pull/2519
The ONNX model now runs successfully with onnxruntime. Nice!