ROCm / aotriton

Ahead of Time (AOT) Triton Math Library
MIT License

Restore the support of causal=True and seqlen_q != seqlen_k #55

Closed: xinyazhang closed this 1 week ago

xinyazhang commented 2 weeks ago

Major changes

  1. Fix the Triton kernel to support causal=True with seqlen_q != seqlen_k
    • This matches the efficient attention behavior (the causal mask is aligned to the top-left corner)
  2. Re-enable the related tests
  3. Re-enable the related tuning (this also fixes the OOM on long seqlen_q/k)
  4. Add the missing entries to the tuning database for causal=True and seqlen_q != seqlen_k
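
To illustrate what "align to top left corner" means when seqlen_q != seqlen_k, here is a minimal sketch in plain Python. It is not the Triton kernel from this PR; the names `causal_mask` and `render` and the `align` parameter are hypothetical and exist only for illustration. With top-left alignment, query row i may attend to key column j when j <= i; the bottom-right variant is included only for contrast.

```python
def causal_mask(seqlen_q, seqlen_k, align="top_left"):
    """Build a seqlen_q x seqlen_k mask; True means query i may attend to key j.

    Hypothetical helper for illustration only, not part of AOTriton.
    """
    if align == "top_left":
        # The diagonal starts at element (0, 0) of the score matrix.
        offset = 0
    elif align == "bottom_right":
        # The diagonal ends at element (seqlen_q - 1, seqlen_k - 1).
        offset = seqlen_k - seqlen_q
    else:
        raise ValueError(f"unknown alignment: {align}")
    return [[j <= i + offset for j in range(seqlen_k)] for i in range(seqlen_q)]


def render(mask):
    """Render the mask as rows of 1 (attend) and 0 (masked out)."""
    return "\n".join("".join("1" if v else "0" for v in row) for row in mask)


if __name__ == "__main__":
    print(render(causal_mask(2, 4, align="top_left")))
    print()
    print(render(causal_mask(2, 4, align="bottom_right")))
```

The two alignments coincide when seqlen_q == seqlen_k, which is why the choice only becomes visible in the unequal-length case this PR restores.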

Known Problems

There are no tuning entries for MI300X/Navi31 yet, because we are going to re-run the tuning script for all cases no later than next week.

xinyazhang commented 2 weeks ago

Most tests passed (105 failed vs. 36759 passed). The failures are due to higher numerical errors on MI200 hardware; I have not fine-tuned the tolerance for MI200 yet.
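
As a rough illustration of why a tolerance setting alone can turn a passing comparison into a failing one, the sketch below uses the common `|actual - expected| <= atol + rtol * |expected|` criterion. The helper `allclose`, the sample values, and the tolerance numbers are all hypothetical; they are not taken from the AOTriton test suite or from MI200 measurements.

```python
def allclose(actual, expected, rtol, atol):
    """Return True if every element satisfies |a - e| <= atol + rtol * |e|."""
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# Hypothetical reference values and a slightly noisier kernel output.
reference = [1.000, -0.500, 0.250]
output = [1.004, -0.502, 0.251]

# The same output fails under a strict tolerance and passes under a looser one.
strict = allclose(output, reference, rtol=1e-3, atol=1e-3)
loose = allclose(output, reference, rtol=1e-2, atol=1e-3)

if __name__ == "__main__":
    print("strict tolerance passes:", strict)
    print("loose tolerance passes:", loose)
```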
