Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

Failed to build flash-attn #1067

Open qishisuren123 opened 2 months ago

qishisuren123 commented 2 months ago

torch==2.0.1, CUDA 11.4

nanowell commented 2 months ago

This repository is a total mess. Even if you succeed in building it, you will probably face a dozen other stupid bugs, like issue #915. It's better to use the PyTorch implementation of flash-attention.
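
For reference, a minimal sketch of that alternative, assuming torch >= 2.0 and a recent CUDA GPU; the shapes, dtypes, and the `sdp_kernel` backend restriction below are illustrative, not a drop-in replacement for every flash-attn feature:

```python
# Sketch: PyTorch's built-in scaled_dot_product_attention can dispatch to a
# FlashAttention kernel internally, so no separate flash-attn build is needed.
import torch
import torch.nn.functional as F

# (batch, heads, seqlen, headdim), fp16 on CUDA -- illustrative sizes only
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Restrict dispatch to the flash kernel to confirm it is usable on this GPU.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([2, 8, 1024, 64])
```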

siyouhe666 commented 1 month ago

One method I found that may solve the error. My environment: OS: Linux, torch: 2.1.1+cu121, GPU: A800, Python: 3.9.9.

1. Update the environment config:

```bash
cd /usr/local
ls
```

You will see entries like `bin`, `cuda`, `cuda-11.x`, `cuda-12.x`. Point `CUDA_HOME` at your correct version in `~/.bashrc`:

```bash
vim ~/.bashrc
```

Append a line like:

```bash
export CUDA_HOME=/usr/local/{your cuda version}
```

For example, in my case: `export CUDA_HOME=/usr/local/cuda-12.2`. Save and exit, then refresh the config:

```bash
source ~/.bashrc
```
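
As a quick sanity check after step 1 (a sketch; the paths and versions shown are just examples), the toolkit that `CUDA_HOME` and `nvcc` point to should match the CUDA version your torch wheel was built against, since a mismatch is a common cause of failed flash-attn builds:

```python
# Sketch: verify that the toolkit selected via CUDA_HOME matches the torch build.
# Assumes torch is already installed; the printed versions are examples only.
import os
import subprocess
import torch

print("CUDA_HOME  =", os.environ.get("CUDA_HOME"))   # e.g. /usr/local/cuda-12.2
print("torch CUDA =", torch.version.cuda)             # e.g. 12.1
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)
```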

2. Reinstall flash-attn, compiling from source:

```bash
MAX_JOBS=4 python setup.py install
```

But if you run the command above you may get another error:

```
cutlass/numeric_types.h: No such file or directory
```

You can get around it by installing through pip instead:

```bash
MAX_JOBS=4 pip install flash-attn
```
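
Once the install succeeds, a short smoke test like the following (shapes and dtypes are illustrative) can confirm the extension actually loads and runs; `flash_attn_func` expects fp16/bf16 CUDA tensors of shape (batch, seqlen, nheads, headdim):

```python
# Sketch: minimal smoke test after installing flash-attn.
import torch
from flash_attn import flash_attn_func

q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```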