[CUDA, DML] MatMul does not properly handle matrices with inner dim == 0

microsoft / onnxruntime

[CUDA, DML] MatMul does not properly handle matrices with inner dim == 0 #21483

Open yuslepukhin opened 3 months ago

yuslepukhin commented 3 months ago

Describe the issue

MatMul is expected to produce a valid result when multiplying matrices whose inner dimension is zero. For example, operands of shapes {16, 0} x {0, 16} should produce a zero-filled matrix of shape {16, 16}.

This is properly supported in the CPU EP, but it is confirmed not to work in the CUDA and DML providers.

This feature is necessary to support the current design of LoRA adapters in GenAI, as well as for correctness.
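
For illustration only, here is a minimal pure-Python sketch of why zeros are the natural result: each output element is a sum over the inner dimension, and a sum over zero terms is 0. The helper `naive_matmul` and its flat row-major layout with explicit shapes are assumptions made for this example; this is not how any ONNX Runtime execution provider implements MatMul.

```python
def naive_matmul(a, a_shape, b, b_shape):
    """Textbook matrix product over flat row-major lists with explicit shapes."""
    m, k = a_shape
    k2, n = b_shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            # When k == 0 the generator is empty and sum() returns 0,
            # so every output element is zero.
            row.append(sum(a[i * k + p] * b[p * n + j] for p in range(k)))
        out.append(row)
    return out


# {16, 0} x {0, 16}: both operands hold no elements at all.
result = naive_matmul([], (16, 0), [], (0, 16))
print(len(result), len(result[0]), result[0][0])  # 16 16 0
```

The output shape depends only on the outer dimensions, so a backend that rejects a zero inner dimension could in principle short-circuit to a zero-filled output of that shape.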

To reproduce

CUDA complains about dimensions equal to zero.

Urgency

No response

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

C++

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

No response

fdwr commented 3 months ago

Yeah, that's illegal from the DirectML API validator's point of view, multiplying nothing times nothing and expecting something 😉. One could argue the output (since no multiplication actually occurred) should be NaNs instead. Though, why is a model generator producing such a degenerate operation, rather than just outputting a ConstantOfShape or Expand? Is there more context near the pertinent graph region you can show (via Netron) of what operators come before and after?

skottmckay commented 3 months ago

I too was very surprised that you could magic up data from nothing, and that there was a default value to use which wasn't specified anywhere.

But the spec says "behaves like numpy.matmul" and numpy.matmul does indeed produce zeros.
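
A quick way to check the numpy behaviour referred to here, as a sketch that assumes only that numpy is installed (the operands are filled with ones just to show that their contents never reach the output):

```python
import numpy as np

# Operands with a zero-sized inner dimension, matching the shapes in the issue.
a = np.ones((16, 0), dtype=np.float32)
b = np.ones((0, 16), dtype=np.float32)

c = np.matmul(a, b)

# The output shape comes from the outer dimensions; every element is zero.
print(c.shape, c.dtype, bool((c == 0).all()))
```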

yuslepukhin commented 3 months ago

Eigen, which powers our CPU EP implementation, does the same as numpy.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.