Open JiahuaZhao opened 1 month ago
+1, the default branch "flash_attention_for_rocm" is 272 commits behind Tri Dao's upstream repo, and many of its APIs are incompatible with some frameworks. Is there any way to resolve this? Are any newer branches planned?
I don't know what another +1 is worth, but catching up on lower-right causal masking and paged attention specifically would make a world of difference for ROCm users.
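For anyone unfamiliar with the term, here is a minimal sketch in plain PyTorch (not the flash-attn ROCm kernels themselves, and the helper name is just for illustration) of what "lower-right" causal masking means: since flash-attn ≥ 2.1, the causal mask is aligned to the bottom-right of the attention matrix when `seqlen_q != seqlen_k`, which is the behavior incremental decoding with a KV cache expects.

```python
# Reference-only sketch of top-left vs. bottom-right ("lower-right") causal masks.
import torch

def causal_mask(seqlen_q: int, seqlen_k: int, bottom_right: bool = True) -> torch.Tensor:
    """Boolean mask: True where query position i may attend to key position j."""
    i = torch.arange(seqlen_q).unsqueeze(1)   # query positions, shape (seqlen_q, 1)
    j = torch.arange(seqlen_k).unsqueeze(0)   # key positions, shape (1, seqlen_k)
    if bottom_right:
        # Diagonal anchored at the bottom-right corner: the last query row
        # sees every key, as needed when decoding against a longer KV cache.
        return j <= i + (seqlen_k - seqlen_q)
    # Old top-left alignment: query i sees keys 0..i regardless of seqlen_k.
    return j <= i

print(causal_mask(2, 5, bottom_right=True).int())   # rows: [1,1,1,1,0], [1,1,1,1,1]
print(causal_mask(2, 5, bottom_right=False).int())  # rows: [1,0,0,0,0], [1,1,0,0,0]
```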
Suggestion Description
When we do long-context inference (using LongLoRA), errors sometimes occur saying that flash-attn version ≥ 2.1.0 is required. So I'm wondering whether a newer version will follow.
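For context, that error usually comes from a simple runtime version probe in the calling framework. A rough sketch of such a guard (the function name is hypothetical, not LongLoRA's actual helper) looks like this:

```python
# Illustrative version guard: enable the flash-attn path only if a new-enough
# build is installed; otherwise fall back to the default attention path.
import importlib.util

from packaging import version

def flash_attn_at_least(minimum: str = "2.1.0") -> bool:
    if importlib.util.find_spec("flash_attn") is None:
        return False
    import flash_attn
    return version.parse(flash_attn.__version__) >= version.parse(minimum)

use_flash_attn = flash_attn_at_least("2.1.0")
print(f"flash-attn >= 2.1.0 available: {use_flash_attn}")
```

The ROCm fork currently reports an older version, so such checks fail even when the extension itself imports fine.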
Operating System
SUSE
GPU
MI250X
ROCm Component
ROCm 5.4.3