huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Flash attention build running forever on colab #34466

Open BoccheseGiacomo opened 3 weeks ago

BoccheseGiacomo commented 3 weeks ago

Bug description

I am running a script on Google Colab to fine-tune Llama 3 8B with flash attention. This issue is not directly related to transformers but to an extension library, flash-attn.

During the installation of the last package, flash-attn, the console prints "Building wheels for collected packages: flash-attn" and then hangs forever. The issue was not present before October 15, 2024, when this installation worked fine.
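
As a possible workaround while waiting, the flash-attn README suggests capping the number of parallel compile jobs; a sketch of that install command (the source build can still take a long time on Colab's limited CPU):

    # flash-attn's build needs ninja and packaging available when build isolation is disabled
    !pip install ninja packaging
    # Cap ninja's parallel compile jobs so the source build does not exhaust the Colab VM
    !MAX_JOBS=4 pip install flash-attn --no-build-isolation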

System Info

Running a script on Google Colab to fine-tune Llama 3 8B with flash attention.

Setup of packages:

!pip install -U transformers
!pip install -U datasets
!pip install -U accelerate
!pip install -U peft
!pip install -U trl
!pip install -U bitsandbytes
!pip install -U wandb
!pip install -U flash-attn
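
Before the flash-attn step, it can help to check which PyTorch, CUDA, and GPU the runtime actually provides, since flash-attn has to be built against that exact combination; a minimal diagnostic sketch:

    import torch
    print(torch.__version__)               # PyTorch version the flash-attn build links against
    print(torch.version.cuda)              # CUDA toolkit version bundled with this PyTorch
    print(torch.cuda.get_device_name(0))   # GPU model, e.g. an A100 on Colab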

Who can help?

No response

Reproduction

  1. Go to Google Colab and select an A100 GPU

  2. Set up the following code to install the packages

    !pip install -U transformers
    !pip install -U datasets
    !pip install -U accelerate
    !pip install -U peft
    !pip install -U trl
    !pip install -U bitsandbytes
    !pip install -U wandb
    !pip install -U flash-attn
  3. Wait

The issue was not present before October 15, 2024, when this same installation worked fine.
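
To narrow down where the install stalls, a verbose install shows whether pip is compiling the CUDA kernels from source (a diagnostic sketch, not a fix):

    # -v streams the build log, so a stalled compilation step becomes visible
    !pip install -U flash-attn -v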

Expected behavior

Building the wheel should finish in 1-2 minutes at most; instead it never ends. I also tried waiting for 30 minutes.
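
For context, once flash-attn is installed, transformers picks it up through the attn_implementation argument when loading the model; a minimal loading sketch (the model id, dtype, and device_map here are assumptions for illustration, not taken from the report):

    import torch
    from transformers import AutoModelForCausalLM

    # Hypothetical model id for illustration; Llama 3 weights are gated on the Hub
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Meta-Llama-3-8B",
        torch_dtype=torch.bfloat16,
        attn_implementation="flash_attention_2",
        device_map="auto",
    )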

benjamin-marie commented 3 weeks ago

I can confirm, but it's probably not related to transformers.

FlashAttention takes many hours to build with PyTorch 2.5. If you downgrade to 2.4, it should work:

    !pip install torch=='2.4.1+cu121' torchvision=='0.19.1+cu121' torchaudio=='2.4.1+cu121' --index-url https://download.pytorch.org/whl/cu121
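
After the downgrade, reinstalling flash-attn and checking the installed versions confirms the fix took effect; a sketch (note that Colab usually needs a runtime restart after changing the torch version before it can be imported):

    !pip install -U flash-attn

    # After restarting the Colab runtime:
    import torch, flash_attn
    print(torch.__version__)       # expected: 2.4.1+cu121
    print(flash_attn.__version__)  # confirms the flash-attn wheel built and installed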

nabeelsana commented 2 weeks ago

Thanks a lot, this has resolved the issue.