Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

Building wheel for flash-attn (pyproject.toml) did not run successfully #225

Open MilesQLi opened 1 year ago

MilesQLi commented 1 year ago

I attached the log. I installed g++:

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
g++ is already the newest version (4:11.2.0-1ubuntu1).
g++ set to manually installed.
The following package was automatically installed and is no longer required:
  libreoffice-ogltrans
Use 'sudo apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 213 not upgraded.

See attached log. error_log_flash_attn.txt

tridao commented 1 year ago

The main error is

/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
        435 |         function(_Functor&& __f)

I found a similar issue where downgrading to gcc 10 (instead of 11) works better with nvcc (idk why). https://github.com/stotko/stdgpu/issues/337

I usually use Nvidia's PyTorch container with gcc 9.4, and that works fine.

MilesQLi commented 1 year ago

@tridao Thanks a lot for the quick reply! But after I downgraded gcc, I hit the same issue.

gcc (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See log please. error_log_flash_attn.txt

tridao commented 1 year ago

Yeah I'm not sure what's wrong, could just be compiler versions. You can try the nvidia docker image.

MilesQLi commented 1 year ago

Switching to gcc 10 works. The catch was that even after I changed the default compiler to gcc 10, the build still used gcc 11 to compile.
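
One way to see which compiler a build will actually pick up is to resolve each candidate name on PATH and follow its symlinks. A minimal sketch, assuming a POSIX shell with `readlink -f` available; `show_compiler` is a hypothetical helper name, not part of flash-attn or pip:

```shell
# Hypothetical helper: show where a command name resolves on PATH,
# following symlinks such as the ones update-alternatives manages.
show_compiler() {
    sc_path=$(command -v "$1") || { echo "$1: not found on PATH"; return 1; }
    echo "$1 -> $(readlink -f "$sc_path")"
}

# Names a build may invoke; which ones matter depends on the build system.
for tool in gcc g++ cc c++; do
    show_compiler "$tool" || true
done

# Explicit overrides, if any, that the build may honour instead of PATH.
echo "CC=${CC:-unset} CXX=${CXX:-unset}"
```

If `gcc` resolves to gcc-10 but `c++` or `g++` still resolves to a gcc-11 binary, that mismatch would be consistent with the behaviour described above.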

ryurobin1990 commented 1 year ago

@MilesQLi I have the same issue. May I ask what you did to switch to gcc 10?

eduardm commented 1 year ago

@ryurobin1990

sudo apt install gcc-10 g++-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10 --slave /usr/bin/g++ g++ /usr/bin/g++-10
sudo update-alternatives --config gcc
sudo apt purge --autoremove -y gcc-11

tbenst commented 1 year ago

@ryurobin1990

sudo apt install gcc-10 g++-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10 --slave /usr/bin/g++ g++ /usr/bin/g++-10
sudo update-alternatives --config gcc
sudo apt purge --autoremove -y gcc-11

Tried this, and now getting subprocess.CalledProcessError: Command '['which', 'c++']' returned non-zero exit status 1.

On Sherlock (where I can install) I see,

$ which c++
/share/software/user/open/gcc/10.3.0/bin/c++

But I do not have c++ on my PATH in ubuntu. Really hard to google for help on this error, can't find any reference to what the c++ executable does, perhaps it's an alias for g++ ...?

edit: I was able to build after doing sudo ln -s /usr/bin/g++-10 /usr/bin/c++

also, in future someone could try CXX=g++-10 CC=gcc-10 LD=g++-10 pip install flash-attn --no-build-isolation
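
The environment-variable route above leaves the system default compiler alone. A minimal sketch of that idea, assuming versioned binaries named `g++-N` and `gcc-N` on PATH (the Debian/Ubuntu convention); `pick_gxx` is a hypothetical helper, the ceiling of 10 is simply the version that worked in this thread rather than a documented nvcc limit, and the floor of 7 is an arbitrary stopping point:

```shell
# Hypothetical helper: print the newest major version N, no higher than
# the ceiling in $1, for which both g++-N and gcc-N exist on PATH.
pick_gxx() {
    pg_n=$1
    while [ "$pg_n" -ge 7 ]; do   # 7 is an arbitrary floor for this sketch
        if command -v "g++-$pg_n" >/dev/null 2>&1 &&
           command -v "gcc-$pg_n" >/dev/null 2>&1; then
            echo "$pg_n"
            return 0
        fi
        pg_n=$((pg_n - 1))
    done
    return 1
}

if ver=$(pick_gxx 10); then
    # Print the command instead of running it, so it can be reviewed first.
    echo "CXX=g++-$ver CC=gcc-$ver LD=g++-$ver pip install flash-attn --no-build-isolation"
else
    echo "no g++-N/gcc-N pair (7 <= N <= 10) found on PATH" >&2
fi
```

Because the variables are set only for that one command, nothing under /usr/bin has to be symlinked or purged for this to take effect, provided the build honours `CXX` and `CC`.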

eduardm commented 1 year ago

@tbenst Just symlink c++ to g++.

whcpumpkin commented 1 year ago

After running @eduardm's instructions, I ran sudo ln -s /usr/bin/g++-10 /usr/bin/c++ and then CXX=g++-10 CC=gcc-10 LD=g++-10 pip install flash-attn --no-build-isolation, and that worked for me.

kyleboddy commented 7 months ago

Running this command purged a ton of my nvidia drivers:

sudo apt purge --autoremove -y gcc-11

Removing libnvblas12:amd64 (12.2.5.6~12.2.2-0lambda0.22.04.1) ...
Removing nvidia-headless-no-dkms-535 (535.129.03-0lambda0.22.04.1) ...
Removing nvidia-compute-utils-535 (535.129.03-0lambda0.22.04.1) ...
Removing libnvidia-cfg1-535:amd64 (535.129.03-0lambda0.22.04.1) ...
Removing libnvidia-extra-535:amd64 (535.129.03-0lambda0.22.04.1) ...
Removing libnvidia-ml-dev:amd64 (12.2.140~12.2.2-0lambda0.22.04.1) ...
Removing libnvvm4:amd64 (12.2.140~12.2.2-0lambda0.22.04.1) ...
Removing nvidia-kernel-common-535 (535.129.03-0lambda0.22.04.1) ...
update-initramfs: deferring update (trigger activated)
Removing nvidia-firmware-535-535.129.03 (535.129.03-0lambda0.22.04.1) ...
Removing nvidia-kernel-source-535 (535.129.03-0lambda0.22.04.1) ...
Removing nvidia-opencl-dev:amd64 (12.2.140~12.2.2-0lambda0.22.04.1) ...
Removing nvidia-utils-535 (535.129.03-0lambda0.22.04.1) ...
Removing ocl-icd-opencl-dev:amd64 (2.2.14-3) ...
Removing ocl-icd-libopencl1:amd64 (2.2.14-3) ...
Removing opencl-clhpp-headers (3.0~2.0.15-1ubuntu1) ...
Removing opencl-c-headers (3.0~2022.01.04-1) ...
Removing python3-ml-dtypes (0.2.0+eigen20230307.7bf2968-0lambda0.22.

So I can't recommend that approach. I'm looking into a saner way to downgrade and will try these instructions later.
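
A purge like this can be previewed before anything is removed. A minimal sketch, assuming apt-get's simulation mode (`-s`), which prints planned removals as lines starting with `Remv` or `Purg`; `flag_driver_removals` is a hypothetical helper, and the name pattern it searches for is only a guess at what matters on a machine like this one:

```shell
# Hypothetical helper: read `apt-get -s ...` output on stdin and print the
# planned removals whose package name looks NVIDIA- or CUDA-related.
# Exit status is 0 when at least one such line is found.
flag_driver_removals() {
    grep -E '^(Remv|Purg) ' | grep -E -i '^[A-Za-z]+ [^ ]*(nvidia|cuda|libnv)'
}

# Simulation only: -s makes apt-get report what it would do without doing it.
if command -v apt-get >/dev/null 2>&1; then
    apt-get -s purge --autoremove gcc-11 | flag_driver_removals ||
        echo "no NVIDIA/CUDA packages flagged in the simulated purge"
fi
```

Any line this prints is a package the real command would take out along with gcc-11, which is a reason to drop `--autoremove` or to skip the purge entirely.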

kyleboddy commented 7 months ago

After fixing those issues, I tried:

sudo apt install gcc-10 g++-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10 --slave /usr/bin/g++ g++ /usr/bin/g++-10
sudo update-alternatives --config gcc

Then:

sudo ln -s /usr/bin/g++-10 /usr/bin/c++

kyle@aeternum:~$ which c++
/usr/bin/c++

And some more system details:

CUDA:

kyle@aeternum:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

Python (dev-10 installed as well):

kyle@aeternum:~$ python3 -V
Python 3.10.12

G++ / C++ info:

kyle@aeternum:~$ c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10.5.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.5.0 (Ubuntu 10.5.0-1ubuntu1~22.04)

And run:

kyle@aeternum:~$ MAX_JOBS=8 CXX=g++-10 CC=gcc-10 LD=g++-10 pip install flash-attn --no-build-isolation

I still get the same two errors: the wheel fails to build, and then the setup.py install fails as well.

RuntimeError: Error compiling objects for extension
error: legacy-install-failure
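
The last lines pip prints are only a summary; the compiler error that triggered them sits earlier in the output, as with the std_function.h error quoted near the top of this thread. A minimal sketch for finding it, assuming the build output was saved to a file; `first_error` and the `build.log` file name are hypothetical:

```shell
# Hypothetical helper: print the first line of a saved build log that
# contains "error:", prefixed with its line number.
first_error() {
    grep -n -m 1 'error:' "$1"
}

# Capture a verbose build first, for example:
#   pip install -v flash-attn --no-build-isolation 2>&1 | tee build.log
if [ -f build.log ]; then
    first_error build.log || echo "no 'error:' line in build.log"
fi
```

The include path in that first error (for example /usr/include/c++/11/ versus /usr/include/c++/10/) also hints at which compiler's headers the build actually used.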

I'll look around the repo for more help, but has anyone else had success here? I got Flash Attention 2 installed just fine on another system, but this one isn't playing nice. Both are AMD machines with RTX 3090s and the Lambda Stack installed (CUDA 12.0+, nvidia-driver-535, cuda-toolkit, etc.). The only difference is that this one has a Ryzen Threadripper 2970WX and the other a 3rd-gen Ryzen. Both have 128 GB of RAM and NVMe drives.

Anyone else have success in slaying this repeated beast?

kyleboddy commented 7 months ago

I went back to gcc-11 as the system default, then threw a hail mary and re-ran the command, forgetting that I had kept the env vars at gcc-10, and this somehow worked. I have no idea why. Just updating here. Good luck, everyone.

kyle@aeternum:~$ sudo update-alternatives --display gcc
[sudo] password for kyle:
gcc - auto mode
  link best version is /usr/bin/gcc-10
  link currently points to /usr/bin/gcc-10
  link gcc is /usr/bin/gcc
  slave g++ is /usr/bin/g++
/usr/bin/gcc-10 - priority 10
  slave g++: /usr/bin/g++-10
kyle@aeternum:~$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 20 --slave /usr/bin/g++ g++ /usr/bin/g++-11
update-alternatives: using /usr/bin/gcc-11 to provide /usr/bin/gcc (gcc) in auto mode
kyle@aeternum:~$ sudo update-alternatives --config gcc
There are 2 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path             Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-11   20        auto mode
  1            /usr/bin/gcc-10   10        manual mode
  2            /usr/bin/gcc-11   20        manual mode

Press <enter> to keep the current choice[*], or type selection number:
kyle@aeternum:~$ gcc --version
g++ --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

kyle@aeternum:~$ MAX_JOBS=12 CXX=g++-10 CC=gcc-10 LD=g++-10 pip install flash-attn
Defaulting to user installation because normal site-packages is not writeable
Collecting flash-attn
  Using cached flash_attn-2.5.2.tar.gz (2.5 MB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: einops in ./.local/lib/python3.10/site-packages (from flash-attn) (0.7.0)
Requirement already satisfied: ninja in ./.local/lib/python3.10/site-packages (from flash-attn) (1.11.1.1)
Requirement already satisfied: packaging in /usr/lib/python3/dist-packages (from flash-attn) (21.3)
Requirement already satisfied: torch in /usr/lib/python3/dist-packages (from flash-attn) (2.0.1)
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... done
  Created wheel for flash-attn: filename=flash_attn-2.5.2-cp310-cp310-linux_x86_64.whl size=120121301 sha256=2c393365c5be96ccba654797854256b12686ef68458bbc5bf7a3038e6490807f
  Stored in directory: /home/kyle/.cache/pip/wheels/ff/8e/8b/5fb0f8eb882c58b2bcfb4302860884d1b45d8513f8b3daac9c
Successfully built flash-attn
Installing collected packages: flash-attn
Successfully installed flash-attn-2.5.2