I'm using FlashAttention 1.0.9, which supports Turing GPUs (2080 Ti), but I get this error:
RuntimeError: FlashAttention backward for head dim > 64 requires A100 or H100 GPUs as the implementation needs a large amount of shared memory.
The main constraint is the amount of shared memory.
As the error above says, the backward pass for head dim > 64 requires an A100 or H100. The forward pass for head dim <= 128, and the backward pass for head dim <= 64, work on other GPUs (including Turing).
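If it helps, here is a minimal sketch of a guard you can run before training to check whether your configuration fits these limits. The helper name is hypothetical (it is not part of the flash-attn API), and the mapping of A100 to sm80 and H100 to sm90 is an assumption based on the error message:

```python
import torch

def flash_attn_supported(head_dim: int, backward: bool = True) -> bool:
    """Hypothetical guard encoding the FlashAttention 1.x constraint
    quoted above: backward for head dim > 64 needs an A100 (assumed
    sm80) or H100 (assumed sm90); forward up to head dim 128 and
    backward up to head dim 64 work on other GPUs, e.g. Turing (sm75,
    such as the 2080 Ti)."""
    cap = torch.cuda.get_device_capability()
    if backward and head_dim > 64:
        # Large-head-dim backward needs the bigger shared memory
        # of A100/H100.
        return cap in ((8, 0), (9, 0))
    return head_dim <= 128
```

So on a 2080 Ti you could either keep the head dim at 64 or below (e.g. use more heads with a smaller dim per head) if you need the backward pass, or run forward-only (inference) with head dim up to 128.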