ROCm / triton

Development repository for the Triton language and compiler
MIT License

Autotune matrix_instr_nonkdim #499

Closed: htyu closed this 5 months ago

htyu commented 5 months ago

Adding matrix_instr_nonkdim to the autotune configs to enable tuning with different mfma shapes.

zhanglx13 commented 5 months ago

@htyu Why do you need to add matrix_instr_nonkdim to the autotuner? Is adding it here like waves_per_eu not enough?

htyu commented 5 months ago

> @htyu Why do you need to add matrix_instr_nonkdim to the autotuner? Is adding it here like waves_per_eu not enough?

Oh, I didn't know about waves_per_eu. What does it do? We basically want to use the 16x16 mfma instruction instead of the default 32x32 one. Can it be achieved by tuning waves_per_eu?

zhanglx13 commented 5 months ago

@htyu waves_per_eu has nothing to do with matrix_instr_nonkdim. I referred to it because you can use matrix_instr_nonkdim (add it to kernel arg and tuning) in the same way as waves_per_eu. So there is no need to add it to the autotuner.

htyu commented 5 months ago

> @htyu waves_per_eu has nothing to do with matrix_instr_nonkdim. I referred to it because you can use matrix_instr_nonkdim (add it to kernel arg and tuning) in the same way as waves_per_eu. So there is no need to add it to the autotuner.

I see. That also works. Thanks!
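
For readers following along, the approach zhanglx13 describes (treating matrix_instr_nonkdim as one more tunable keyword next to waves_per_eu, with no autotuner change) might look roughly like the sketch below. It only builds the candidate keyword dictionaries in plain Python. The tile sizes, the waves_per_eu values, and the helper name build_configs are illustrative assumptions, not taken from this PR; only the 16 and 32 mfma shapes come from the discussion above.

```python
from itertools import product

# Hypothetical tile sizes, chosen only for illustration.
BASE_TILES = [
    {"BLOCK_M": 128, "BLOCK_N": 128, "BLOCK_K": 32},
    {"BLOCK_M": 64, "BLOCK_N": 64, "BLOCK_K": 64},
]

# 16 selects the 16x16 mfma shape, 32 the default 32x32 one (per the thread).
MFMA_NONKDIM = (16, 32)

# Assumed candidate values; the thread does not list any.
WAVES_PER_EU = (0, 2)


def build_configs():
    """Return one keyword dictionary per (tile, mfma shape, waves) combination."""
    configs = []
    for tile, nonkdim, waves in product(BASE_TILES, MFMA_NONKDIM, WAVES_PER_EU):
        cfg = dict(tile)  # copy so the base tile dictionaries stay unchanged
        cfg["matrix_instr_nonkdim"] = nonkdim
        cfg["waves_per_eu"] = waves
        configs.append(cfg)
    return configs


configs = build_configs()
for cfg in configs:
    print(cfg)
```

In a tuning script, each dictionary would presumably be passed as the keyword dictionary of a `triton.Config`; that step is omitted here because how the extra keys reach the kernel depends on the fork version.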