jackaihfia2334 opened 1 year ago
same
same
@jackaihfia2334
I was not able to install/build Flash Attention for Windows (I don't think it's supported yet), so I decided to create a new environment under WSL2 to use Flash Attention for LLM training.
I started by installing the Text Generation WebUI requirements, followed by CUDA 11.8 and NVCC for CUDA 11.8.
Then I tried to install flash-attention, without success; I was getting the same error message as in your issue, though the shell log looked a bit different.
Lots of troubleshooting later, I finally got it working. Not sure if it'll work for you, but this might help.
sudo apt install wget
wget https://repo.anaconda.com/archive/Anaconda3-2023.07-1-Linux-x86_64.sh
sudo sh Anaconda3-2023.07-1-Linux-x86_64.sh (if you get stuck in the user agreement text, use 'q')
sudo apt install git
sudo apt install build-essential
pip install ninja
sudo apt install libxml2
conda config --add channels conda-forge
conda install -c conda-forge clang (I was getting g++ errors while compiling; not 100% sure this fixed it.)
conda install -c conda-forge clang-tools
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
sudo ln -s /usr/lib/wsl/lib/libcuda.so.1 /usr/local/cuda-11.8/lib64/libcuda.so
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
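Optionally, verify that the cu118 wheels can see the GPU from WSL2 (a minimal sanity check using the standard PyTorch API):
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"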
git clone https://github.com/facebookresearch/xformers
cd xformers
git submodule update --init --recursive
MAX_JOBS=4 python setup.py build (MAX_JOBS dictates how many cores are used for compilation; more cores require more memory, about 122GB for 16 cores.)
python setup.py install
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention/
git submodule update --init --recursive
MAX_JOBS=4 python setup.py build (MAX_JOBS dictates how many cores are used for compilation; more cores require more memory, about 96GB for 16 cores.)
python setup.py install
And Voilà!
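To confirm both builds import cleanly, a quick sanity check (a minimal sketch; recent releases of both packages expose __version__, but treat that attribute as an assumption):
python -c "import flash_attn; print(flash_attn.__version__)"
python -c "import xformers; print(xformers.__version__)"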
If the build fails because the cutlass.h header is missing, you can copy the cutlass and cute folders from https://github.com/NVIDIA/cutlass/tree/main/include into csrc/flash_attn.
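In command form, a minimal sketch of that fix (assuming you run it from the flash-attention repo root; the paths follow the comment above):
git clone https://github.com/NVIDIA/cutlass
cp -r cutlass/include/cutlass csrc/flash_attn/
cp -r cutlass/include/cute csrc/flash_attn/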
import os
from transformers.dynamic_module_utils import get_imports

def fixed_get_imports(filename: str | os.PathLike) -> list[str]:
    """Workaround for FlashAttention: drop the flash_attn import for Florence-2's modeling file."""
    if os.path.basename(filename) != "modeling_florence2.py":
        return get_imports(filename)
    imports = get_imports(filename)
    imports.remove("flash_attn")
    return imports
This worked for me while loading the Florence-2 model.
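For context, a usage sketch of how such a patch is typically applied (the model ID and patch target here are assumptions for illustration, not part of the original comment):

from unittest.mock import patch
from transformers import AutoModelForCausalLM

# Temporarily swap get_imports so the flash_attn requirement is skipped during remote-code import
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModelForCausalLM.from_pretrained(
        "microsoft/Florence-2-large",  # hypothetical model ID for illustration
        trust_remote_code=True,
    )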
> If the build fails because the cutlass.h header is missing, you can copy the cutlass and cute folders from https://github.com/NVIDIA/cutlass/tree/main/include into csrc/flash_attn.
Solved my problem! Amazing! How do you know that?
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects