[Tuning] Add `matrix_instr_nonkdim` in the tuning space

ROCm / triton

Closed zhanglx13 closed 6 months ago

zhanglx13 commented 6 months ago

GEMM Tuning Script v3.1

With the changes in https://github.com/ROCmSoftwarePlatform/triton/pull/466, mfma16 can outperform mfma32. This PR therefore adds `matrix_instr_nonkdim` to the tuning space so that the tuner can try both instruction sizes.

It also fixes a wrong-type issue in the warmup kernel.
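
For readers unfamiliar with the tuning script, the tuning-space change can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the function names `build_tuning_space` and `prune_configs`, the specific ranges, and the pruning rule (dropping configs whose block is smaller than the MFMA non-K dimension) are assumptions made for illustration. Only the `matrix_instr_nonkdim` key and its two candidate values, 16 (mfma16) and 32 (mfma32), come from the description above.

```python
from itertools import product


def build_tuning_space(
    block_m_range=(16, 32, 64),            # hypothetical ranges, not from the PR
    block_n_range=(16, 32, 64),
    block_k_range=(32, 64),
    num_warps_range=(4, 8),
    matrix_instr_nonkdim_range=(16, 32),   # 16 -> mfma16, 32 -> mfma32
):
    """Enumerate every combination of the tuning parameters as a dict."""
    configs = []
    for bm, bn, bk, nw, nonkdim in product(
        block_m_range,
        block_n_range,
        block_k_range,
        num_warps_range,
        matrix_instr_nonkdim_range,
    ):
        configs.append(
            {
                "BLOCK_SIZE_M": bm,
                "BLOCK_SIZE_N": bn,
                "BLOCK_SIZE_K": bk,
                "num_warps": nw,
                "matrix_instr_nonkdim": nonkdim,
            }
        )
    return configs


def prune_configs(configs):
    """Drop configs whose block is smaller than the MFMA non-K dimension.

    Assumed rule for illustration: a 32x32 MFMA tile cannot fit in a
    block that is only 16 wide in M or N.
    """
    return [
        c
        for c in configs
        if c["matrix_instr_nonkdim"] <= c["BLOCK_SIZE_M"]
        and c["matrix_instr_nonkdim"] <= c["BLOCK_SIZE_N"]
    ]


if __name__ == "__main__":
    space = build_tuning_space()
    pruned = prune_configs(space)
    print(f"{len(space)} candidate configs, {len(pruned)} after pruning")
```

Treating the instruction size as one more axis of the search space doubles the number of candidates, which is why a pruning step of this kind is useful before benchmarking.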