Open complexfilter opened 4 months ago
Hi, we're closely following what's happening. The current implementation has some bugs, which we reported, but once it's ready it will be integrated :) https://github.com/Dao-AILab/flash-attention/issues/1052
Looks like it has been fixed.
Indeed :) We're working on this; hopefully we can have it in xFormers next week as an experimental feature.
This is taking a bit more time than expected. Hopefully we will have it by next week, but we're not sure.
It seems the current code already includes the FA3 implementation? Any updates on how to enable/disable it? @danthe3rd
For now, _USE_FLASH_ATTENTION_3 = False is set by default in ops/fmha/dispatch.py.
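For anyone who wants to experiment before FA3 is officially enabled, here is a minimal sketch of checking and flipping that flag at runtime. The module path xformers.ops.fmha.dispatch is assumed from the ops/fmha/dispatch.py path mentioned above; the flag is private and experimental, so its name and location may change between releases.

```python
# Sketch only: inspect (and optionally opt in to) the experimental FA3 flag.
# Assumes the flag lives in xformers.ops.fmha.dispatch, matching the
# ops/fmha/dispatch.py path mentioned above; this is a private attribute
# and may be renamed, moved, or removed in future xFormers versions.
from xformers.ops.fmha import dispatch

# Check the current default (False per this thread, absent in older versions).
print(getattr(dispatch, "_USE_FLASH_ATTENTION_3", "flag not present in this version"))

# Flip the module-level flag before calling attention ops (at your own risk).
if hasattr(dispatch, "_USE_FLASH_ATTENTION_3"):
    dispatch._USE_FLASH_ATTENTION_3 = True
```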
🚀 Feature
Support Flash Attention 3
Motivation
Flash Attention 3 has been shown to deliver large speedups over Flash Attention 2 on H100 GPUs.
Pitch
Offer Flash Attention 3 support
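For reference, below is a minimal sketch of the call site that FA3 support would presumably accelerate, using xFormers' existing memory_efficient_attention API. No API change is assumed here, only that the dispatcher would pick the FA3 kernel on supported hardware once it is integrated.

```python
# Sketch of a standard memory_efficient_attention call; if/when the FA3 kernel
# is wired into the dispatcher, a call like this on an H100 (fp16/bf16 inputs)
# is the code path that would presumably pick it up automatically.
import torch
import xformers.ops as xops

B, M, H, K = 2, 1024, 8, 64  # batch, sequence length, heads, head dim
q = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
k = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
v = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)

out = xops.memory_efficient_attention(q, k, v)  # dispatcher chooses the backend
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```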