ROCm / triton

Development repository for the Triton language and compiler
MIT License

[GEMM] [Tuning] Add `waves_per_eu` to gemm tuning #362

Closed zhanglx13 closed 1 year ago

zhanglx13 commented 1 year ago

This PR also reduces tuning time by fixing a bug in the pre-compile step.

More details: Previously, the pre-compile step ran in parallel, but the compiled kernels were not cached. As a result, the tuning step still paid the kernel compilation overhead.
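To illustrate why caching matters here, below is a minimal sketch in plain Python. It is not the tuning script from this PR and it does not call any Triton API: `fake_compile`, `get_kernel`, `precompile`, `tune`, the `stats` counter, and the config values are all hypothetical stand-ins. The idea is only that a parallel pre-compile step pays off when its results land in a cache that the later tuning step reads from.

```python
import hashlib
import json
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Counts how many times the (simulated) expensive compilation actually runs.
stats = {"compiles": 0}
_stats_lock = threading.Lock()


def fake_compile(config: dict) -> str:
    """Hypothetical stand-in for an expensive kernel compilation."""
    with _stats_lock:
        stats["compiles"] += 1
    return "binary-for-" + json.dumps(config, sort_keys=True)


def cache_key(config: dict) -> str:
    """Derive a stable file name from the tuning config."""
    payload = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def get_kernel(config: dict, cache_dir: Path) -> str:
    """Return the compiled kernel, compiling only on a cache miss."""
    path = cache_dir / cache_key(config)
    if path.exists():
        return path.read_text()
    binary = fake_compile(config)
    path.write_text(binary)
    return binary


def precompile(configs, cache_dir: Path, workers: int = 4) -> None:
    """Compile all configs in parallel and store the results in the cache."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda c: get_kernel(c, cache_dir), configs))


def tune(configs, cache_dir: Path):
    """Sequential tuning step: with a warm cache, nothing is recompiled."""
    return [get_kernel(c, cache_dir) for c in configs]


if __name__ == "__main__":
    # Illustrative config values only, not taken from the PR.
    configs = [{"BLOCK_M": m, "waves_per_eu": w} for m in (64, 128) for w in (0, 2)]
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp)
        precompile(configs, cache)
        after_precompile = stats["compiles"]
        tune(configs, cache)
        print("compiles during tuning:", stats["compiles"] - after_precompile)
```

If `get_kernel` skipped the `path.write_text` call, the pre-compile step would still run in parallel, but `tune` would compile every config again, which is the overhead described above.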