netw0rkf10w closed this issue 7 months ago
Hi @netw0rkf10w, this repo is synchronized to v2.0.4 of the upstream one.
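For anyone landing here with the same question, one quick way to tell which major version you have locally is to read the installed package metadata. This is a minimal sketch (the `flash_attn_major` helper is hypothetical, not part of the flash-attn API), assuming the package is installed under the name `flash-attn`:

```python
# Hypothetical helper: report the installed flash-attn major version.
# Assumes the package is distributed as "flash-attn"; returns None if absent.
from importlib.metadata import version, PackageNotFoundError


def flash_attn_major(pkg: str = "flash-attn"):
    """Return the major version (int) of the installed package, or None."""
    try:
        return int(version(pkg).split(".")[0])
    except PackageNotFoundError:
        return None


major = flash_attn_major()
if major is None:
    print("flash-attn is not installed")
else:
    print(f"FlashAttention v{major}")
```

A repo synchronized to upstream v2.0.4, as mentioned above, would report major version 2.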
Thanks @howiejayz for your reply. Could you tell me how the current repo compares, performance-wise, to the Triton implementation in your other fork? I'm trying to use flash attention on MI250X cards (and also MI300A ones), and I'm not sure which implementation I should use. Thank you in advance!
Hi @netw0rkf10w, I think you should try the Triton one if possible. This version of Flash-Attention for ROCm is relatively old, and its performance does not match the Triton implementation.
@howiejayz I see. Thanks a lot!
Hello, thanks for your great work! I would like to know whether the current implementation is FA v1 or v2. If it's v1, are you planning to upgrade to v2? Thank you in advance for your replies.