Dao-AILab / flash-attention

Fast and memory-efficient exact attention

support for Quadro RTX 8000? #562

Open · Crazy-LittleBoy opened this issue 9 months ago

FlashAttention 1 supports Turing, but FlashAttention 2 does not?

tridao commented 9 months ago

Yup, it's mentioned in the README:

FlashAttention-2 currently supports:

- Ampere, Ada, or Hopper GPUs (e.g., A100, RTX 3090, RTX 4090, H100). Support for Turing GPUs (T4, RTX 2080) is coming soon, please use FlashAttention 1.x for Turing GPUs for now.
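
Not from the thread, but as a stopgap for readers on Turing hardware: below is a minimal sketch of gating on compute capability, since the Quadro RTX 8000 is a Turing (SM 7.5) part while FlashAttention-2 requires SM 8.0+ (Ampere or newer). The `attention` wrapper and tensor shapes are illustrative assumptions; `flash_attn_func` and PyTorch's `scaled_dot_product_attention` are the real APIs being dispatched to.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, causal=False):
    # q, k, v: (batch, seqlen, nheads, headdim), fp16/bf16 on CUDA
    # (the layout flash_attn_func expects).
    major, minor = torch.cuda.get_device_capability(q.device)
    if (major, minor) >= (8, 0):
        # Ampere/Ada/Hopper: FlashAttention-2 kernels are usable.
        from flash_attn import flash_attn_func  # flash-attn >= 2.x
        return flash_attn_func(q, k, v, causal=causal)
    # Turing (SM 7.5, e.g. Quadro RTX 8000, T4, RTX 2080) and older:
    # fall back to PyTorch SDPA, which wants (batch, nheads, seqlen, headdim).
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, is_causal=causal)
    return out.transpose(1, 2)

# Example: the same call works on Ampere (FA2 path) and Turing (SDPA path).
q = k = v = torch.randn(2, 1024, 16, 64, device="cuda", dtype=torch.float16)
out = attention(q, k, v, causal=True)  # (2, 1024, 16, 64)
```
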
chuanzhubin commented 2 months ago

Support for Turing GPUs (T4, RTX 2080) is coming soon. Looking forward to it. @tridao

tridao commented 2 months ago

Unfortunately I've had no bandwidth to work on this. We welcome contributions.