triton-lang / triton

Development repository for the Triton language and compiler
https://triton-lang.org/
MIT License

[BACKEND] Fix getElemsPerThread for mmav3 dot operand #5189

Closed ThomasRaoux closed 6 days ago

ThomasRaoux commented 6 days ago

In the MMAv3 case, the number of elements per thread should be independent of the element type; only kWidth should be considered. TODO: this should also hold for MMAv2, but the logic there is a bit more complicated.

Also enable a larger block_m in the mixed-mode tests to exercise the MMAv3 case.
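To illustrate the kWidth-only rule described above, here is a rough sketch. It is not the code from this PR. It assumes a simplified model of a register operand A in which each warp covers 16 rows, one tile spans 8 * kWidth elements along K, and each thread holds 4 * kWidth values per tile. The function name and its parameters are hypothetical.

```python
from math import ceil


def elems_per_thread_operand_a(shape, k_width, warps_m):
    """Per-thread element count for an [M, K] operand A (toy model only)."""
    m, k = shape
    tile_m = 16 * warps_m  # assumed: each warp covers 16 rows
    tile_k = 8 * k_width   # assumed: one tile spans 8 * kWidth elements along K
    reps = ceil(m / tile_m) * ceil(k / tile_k)
    # assumed: each thread holds 4 * kWidth values per tile
    return reps * 4 * k_width


# The element type is never passed in: only the shape, kWidth, and the warp
# count along M determine the result in this model.
count_kw2 = elems_per_thread_operand_a((128, 64), k_width=2, warps_m=4)
count_kw4 = elems_per_thread_operand_a((128, 64), k_width=4, warps_m=4)
print(count_kw2, count_kw4)
```

In this model the element bit width is not an input at all, which is the property the fix is after: a mixed-mode operand whose storage type differs from the MMA input type still gets a count derived from kWidth alone.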