ROCm / triton

Development repository for the Triton language and compiler
MIT License

AMD-specific scheduling pass for TTGIR instructions #483

Closed oplavsic closed 3 months ago

oplavsic commented 5 months ago

This PR introduces an AMD-specific scheduling pass. For now, its main purpose is to hoist the Q tensor out of the loop in the FA fwd pass and to schedule the instructions produced by the dot slicing pass.
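
To illustrate the hoisting idea, here is a minimal sketch of loop-invariant code motion on a toy instruction list. It is not the pass from this PR, which operates on TTGIR inside the compiler. The `Op` class, the `hoist_invariant` helper, and the op and value names (`load_q`, `q_ptr`, and so on) are all hypothetical. The sketch also assumes every op is side-effect free, so moving it out of the loop is safe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Toy SSA-style instruction (hypothetical, not a TTGIR op)."""
    name: str
    operands: tuple
    result: str


def hoist_invariant(loop_body, defined_outside):
    """Split loop_body into (hoisted, remaining).

    An op is hoisted when all of its operands are available before the
    loop: either defined outside it or produced by an op that was
    already hoisted. Assumes ops are pure and listed in program order.
    """
    available = set(defined_outside)
    hoisted, remaining = [], []
    for op in loop_body:
        if all(v in available for v in op.operands):
            hoisted.append(op)
            available.add(op.result)
        else:
            remaining.append(op)
    return hoisted, remaining


# Toy attention-style loop body: the Q load does not depend on the
# induction variable "i", while the K load and the dot do.
body = [
    Op("load_q", ("q_ptr",), "q"),
    Op("load_k", ("k_ptr", "i"), "k"),
    Op("dot", ("q", "k"), "qk"),
]
hoisted, remaining = hoist_invariant(body, defined_outside={"q_ptr", "k_ptr"})
print([op.name for op in hoisted])
print([op.name for op in remaining])
```

In this toy model the Q load moves ahead of the loop because none of its operands change between iterations, while the K load and the dot stay inside because they depend on the induction variable.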