ROCm / triton

Development repository for the Triton language and compiler
MIT License

Causal masking with dissimilar Q/KV sequence lengths #519

Status: Closed (closed by vgokhale 4 months ago)

vgokhale commented 4 months ago

1) Added causal masking support to the MHA, MQA and GQA fwd kernel.
2) Causal masking now works with dissimilar Q/KV sequence lengths.
3) Removed vector bias. This is unrelated, but it is not needed and I did not want to keep maintaining unnecessary code.
4) A couple of other unrelated bugfixes found during code review.
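For readers unfamiliar with the problem in items 1 and 2, here is a minimal pure-Python sketch of a causal mask when the Q and KV sequence lengths differ. It is not the kernel code from this PR. It assumes the mask is aligned to the bottom-right corner of the score matrix, so the last query row sees every key. The PR text does not say which alignment the kernel uses. The names `causal_mask` and `kv_head_for` are hypothetical, and the head mapping shown is just the usual grouped-query convention.

```python
def causal_mask(seqlen_q, seqlen_k):
    """Return mask[i][j] == True where query i may attend to key j.

    Assumption: bottom-right alignment, i.e. the diagonal is shifted by
    (seqlen_k - seqlen_q) so the last query row attends to all keys.
    """
    offset = seqlen_k - seqlen_q
    return [[j <= i + offset for j in range(seqlen_k)]
            for i in range(seqlen_q)]


def kv_head_for(q_head, n_q_heads, n_kv_heads):
    """Map a query head index to the KV head it shares.

    MHA: n_kv_heads == n_q_heads (one KV head per query head).
    MQA: n_kv_heads == 1 (all query heads share one KV head).
    GQA: anything in between, with equal-sized contiguous groups (assumed).
    """
    assert n_q_heads % n_kv_heads == 0, "query heads must split evenly"
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size
```

With equal lengths this reduces to the familiar lower-triangular mask. With `seqlen_q > seqlen_k` the offset is negative and the first rows are fully masked, a case a kernel has to handle explicitly because softmax over an all-masked row is otherwise undefined.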