ari9dam opened this issue 1 year ago
try MAX_JOBS=4
Tried MAX_JOBS=4. It failed as well. MAX_JOBS=1 timed out after 1h30m.
@tridao ? Any pointers? Thanks in advance!
There's not enough info here (there's no error message in the compilation log pointing to any specific line). You can try the recommended Dockerfile from Nvidia.
How do I add pip install flash-attn --no-build-isolation to requirements.txt?
I ran into a similar problem and fixed it by installing the latest PyTorch via pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
Besides, using MAX_JOBS=4 reduces memory usage during the build.
@ari9dam IMO, use MAX_JOBS=1 to find out the exact error. In my case:
.../python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
   12 | #include <Python.h>
      |          ^~~~~~~~~~
which was addressed with:
sudo apt install python3-dev
Also, make sure nothing else (like a training session) is running on the GPU. I'm not sure how this affects the build, but killing it helped.
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
MAX_JOBS=8 pip install flash-attn --no-build-isolation
This was on a 216 GB RAM system. I did install PyTorch with cu121 support, and CUDA 12.1 was installed manually as well!
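As a quick sanity check (a minimal sketch, assuming CPython on Linux), you can confirm Python.h is actually present after installing python3-dev:

import os
import sysconfig

# Directory where CPython's C headers live; Python.h must exist here
# for torch C++/CUDA extensions such as flash-attn to compile.
include_dir = sysconfig.get_paths()["include"]
header = os.path.join(include_dir, "Python.h")
print(header, "exists:", os.path.exists(header))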
I think there are a wide variety of factors in play here. For me, I could not build the docker with flash-attn on an A100, but (someone) was able to build the same docker on a V100 (which I later used on an A100). I did not see a gain with flash-attn V2 on LLAMA V2; flash-attn V1 performed better than V2. I replaced the LLAMA MHA forward function as commonly done in hot-patching. My per-iteration time (per-device batch size 10, seq len 4K) increased from 88 seconds to 94 seconds when I switched to V2.
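For reference, a minimal sketch of that hot-patching pattern (illustrative only: my_flash_forward is a hypothetical name, and a real patch must also compute q/k/v and handle rotary embeddings, masks, and the KV cache exactly as the original forward does):

from transformers.models.llama.modeling_llama import LlamaAttention

def my_flash_forward(self, hidden_states, **kwargs):
    # Hypothetical body: compute q, k, v as LlamaAttention.forward does,
    # then call flash_attn_func(q, k, v, causal=True) from flash_attn.
    ...

# Monkey-patch: every LlamaAttention instance now uses the new forward.
LlamaAttention.forward = my_flash_forward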
torch 2.1.0, cuda 12.1, g++ 10.2.1
Running:
apt-get update && apt-get install -y g++
pip install packaging
pip install ninja
pip install flash-attn --no-build-isolation
gives the following error:
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py): started
  Building wheel for flash-attn (setup.py): still running...
  Building wheel for flash-attn (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [10 lines of output]
      No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
      fatal: not a git repository (or any of the parent directories): .git
      torch.__version__ = 2.1.0.dev20230815+cu121
      running bdist_wheel
      Guessing wheel URL: https://github.com/Dao-AILab/flash-attention/releases/download/v2.1.1/flash_attn-2.1.1+cu121torch2.1cxx11abiFALSE-cp39-cp39-linux_x86_64.whl
      error: Remote end closed connection without response
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash-attn
  Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
ERROR: executor failed running [/bin/sh -c pip install flash-attn --no-build-isolation]: runc did not terminate successfully: exit status 1
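Note that the failure here is in setup.py's attempt to download a prebuilt wheel, not in compilation. A possible workaround (my suggestion, not an official recipe) is to fetch that wheel manually and install the local file:

import urllib.request

# URL copied from the "Guessing wheel URL" line in the log above.
url = ("https://github.com/Dao-AILab/flash-attention/releases/download/v2.1.1/"
       "flash_attn-2.1.1+cu121torch2.1cxx11abiFALSE-cp39-cp39-linux_x86_64.whl")
urllib.request.urlretrieve(url, "flash_attn.whl")
# Then: pip install flash_attn.whl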
I've tried installing from both pip and source, but no luck 😢
$ pip install flash-attn --no-build-isolation
Collecting flash-attn
Using cached flash_attn-2.4.2.tar.gz (2.4 MB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from flash-attn) (2.1.0)
Requirement already satisfied: einops in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from flash-attn) (0.7.0)
Requirement already satisfied: packaging in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from flash-attn) (23.2)
Collecting ninja (from flash-attn)
Using cached ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (5.3 kB)
Requirement already satisfied: filelock in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (3.13.1)
Requirement already satisfied: typing-extensions in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (4.9.0)
Requirement already satisfied: sympy in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (1.12)
Requirement already satisfied: networkx in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (3.1)
Requirement already satisfied: jinja2 in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (3.1.2)
Requirement already satisfied: fsspec in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from torch->flash-attn) (2023.10.0)
Requirement already satisfied: MarkupSafe>=2.0 in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from jinja2->torch->flash-attn) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in /home/myun/miniconda3/envs/myen/lib/python3.8/site-packages (from sympy->torch->flash-attn) (1.3.0)
Using cached ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
Building wheels for collected packages: flash-attn
Building wheel for flash-attn (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [34 lines of output]
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
fatal: not a git repository (or any of the parent directories): .git
torch.__version__ = 2.1.0
running bdist_wheel
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-5sq4q_lb/flash-attn_ace098e663d9463aad67312fd7b22387/setup.py", line 285, in <module>
setup(
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
return distutils.core.setup(**attrs)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/dist.py", line 963, in run_command
super().run_command(command)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-install-5sq4q_lb/flash-attn_ace098e663d9463aad67312fd7b22387/setup.py", line 262, in run
wheel_url, wheel_filename = get_wheel_url()
File "/tmp/pip-install-5sq4q_lb/flash-attn_ace098e663d9463aad67312fd7b22387/setup.py", line 231, in get_wheel_url
torch_cuda_version = parse(torch.version.cuda)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/packaging/version.py", line 54, in parse
return Version(version)
File "/home/myun/miniconda3/envs/myen/lib/python3.8/site-packages/packaging/version.py", line 198, in __init__
match = self._regex.search(version)
TypeError: expected string or bytes-like object
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
and checking the /usr/local/cuda dir:
$ ll /usr/local/cuda/
total 144
drwxr-xr-x 17 root root 4096 Dec 26 20:21 ./
drwxr-xr-x 12 root root 4096 Dec 26 20:20 ../
drwxr-xr-x 3 root root 4096 Dec 26 20:21 bin/
drwxr-xr-x 5 root root 4096 Dec 26 20:20 compute-sanitizer/
-rw-r--r-- 1 root root 160 Dec 26 20:21 DOCS
-rw-r--r-- 1 root root 61498 Dec 26 20:21 EULA.txt
drwxr-xr-x 5 root root 4096 Dec 26 20:21 extras/
drwxr-xr-x 6 root root 4096 Dec 26 20:20 gds/
drwxr-xr-x 2 root root 4096 Dec 26 20:20 gds-12.1/
lrwxrwxrwx 1 root root 28 Dec 26 20:21 include -> targets/x86_64-linux/include/
lrwxrwxrwx 1 root root 24 Dec 26 20:21 lib64 -> targets/x86_64-linux/lib/
drwxr-xr-x 7 root root 4096 Dec 26 20:21 libnvvp/
drwxr-xr-x 7 root root 4096 Dec 26 20:20 nsight-compute-2023.1.0/
drwxr-xr-x 2 root root 4096 Dec 26 20:20 nsightee_plugins/
drwxr-xr-x 6 root root 4096 Dec 26 20:21 nsight-systems-2023.1.2/
drwxr-xr-x 3 root root 4096 Dec 26 20:20 nvml/
drwxr-xr-x 7 root root 4096 Dec 26 20:21 nvvm/
-rw-r--r-- 1 root root 524 Dec 26 20:21 README
drwxr-xr-x 3 root root 4096 Dec 26 20:20 share/
drwxr-xr-x 2 root root 4096 Dec 26 20:20 src/
drwxr-xr-x 3 root root 4096 Dec 26 20:20 targets/
drwxr-xr-x 2 root root 4096 Dec 26 20:21 tools/
-rw-r--r-- 1 root root 2928 Dec 26 20:20 version.json
and the ninja:
$ ninja --version
1.11.1
$ echo $?
0
That's because your torch.version.cuda isn't set, i.e., your torch wasn't installed right. Check your torch with:
$ python
>>> import torch
>>> print(torch.version.cuda)
I had just installed CUDA 11.6, which differed from the system CUDA 10.2, so my torch was automatically reinstalled as the CPU version. How incredible.
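That also explains the TypeError in the log above: setup.py calls parse(torch.version.cuda) (see the traceback), and on a CPU-only torch build torch.version.cuda is None. A minimal sketch of the failure:

import torch
from packaging.version import parse

# torch.version.cuda is e.g. '12.1' on a CUDA build, but None on a
# CPU-only build, and parse(None) raises
# "TypeError: expected string or bytes-like object".
print(torch.version.cuda)
parse(torch.version.cuda)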
pip install flash-attn --no-build-isolation
Collecting flash-attn
  Using cached flash_attn-2.5.6.tar.gz (2.5 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
fatal: not a git repository (or any of the parent directories): .git
/tmp/pip-install-18brac5p/flash-attn_5e692969183644e58161eed68af0341b/setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
warnings.warn(
Traceback (most recent call last):
File "
torch.__version__ = 2.0.1+cu117
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I did check the torch version and it's 11.7.
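The warning above says nvcc isn't visible, which is independent of the CUDA version torch reports. A small check (a sketch; adjust for your environment) to compare the two:

import shutil
import subprocess
import torch

# CUDA version torch was built against (what "11.7" above refers to).
print("torch.version.cuda:", torch.version.cuda)

# nvcc is what actually compiles flash-attn's kernels; it must be on PATH.
nvcc = shutil.which("nvcc")
if nvcc is None:
    print("nvcc not found on PATH (matches the warning above)")
else:
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)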
Same thing happened to me
I have the same question.
$ pip install flash-attn --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting flash-attn
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/21/cb/33a1f833ac4742c8adba063715bf769831f96d99dbbbb4be1b197b637872/flash_attn-2.5.7.tar.gz (2.5 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      /tmp/pip-install-39fz2cjo/flash-attn_eaede92fcb76455eab13852d3126d861/setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
        warnings.warn(
      Traceback (most recent call last):
        File "/mnt/flash-attention-2.5.7/setup.py", line 134, in <module>
          CUDAExtension(
        File "/opt/conda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1076, in CUDAExtension
          library_dirs += library_paths(cuda=True)
        File "/opt/conda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1203, in library_paths
          if (not os.path.exists(_join_cuda_home(lib_dir)) and
        File "/opt/conda/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2416, in _join_cuda_home
          raise OSError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
You can fix this by setting the environment variable: CUDA_HOME=/path/to/your/cuda
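To see what torch will actually pick up (a minimal sketch; /usr/local/cuda is the fallback torch's cpp_extension uses, as the "using CUDA_HOME='/usr/local/cuda'" line in the logs above shows):

import os
import shutil

# torch falls back to /usr/local/cuda when CUDA_HOME is unset.
cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
print("CUDA_HOME:", cuda_home, "| exists:", os.path.isdir(cuda_home))
print("nvcc on PATH:", shutil.which("nvcc"))
# If both are missing: export CUDA_HOME=/path/to/your/cuda before building.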
Hi, can you try this gcc version? gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
pip install flash-attn --no-build-isolation fails but pip install flash-attn==1.0.9 --no-build-isolation works
Based on this, can you say what I might try to fix the error?
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> flash-attn
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
The command '/bin/bash --login -c pip install flash-attn' returned a non-zero code: 1
2023/08/03 22:09:43 Container failed during run: acb_step_0. No retries remaining.
failed to run step ID: acb_step_0: exit status 1