Closed: danielhanchen closed this issue 5 months ago
Fixed! Please change all install instructions at the top to
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes
then change the SFTTrainer arguments from
fp16 = not torch.cuda.is_bf16_supported(),
bf16 = torch.cuda.is_bf16_supported(),
to the following (and add from unsloth import is_bfloat16_supported at the top):
# Do from unsloth import is_bfloat16_supported
fp16 = not is_bfloat16_supported(),
bf16 = is_bfloat16_supported(),
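For context, here is a minimal sketch of where those two flags live in a typical TrainingArguments block. Only the fp16 / bf16 lines are the actual fix from this comment; every other hyperparameter below is an illustrative placeholder, not part of the instructions.

```python
# Sketch only: the fp16 / bf16 lines are the fix, the rest are placeholder values.
from unsloth import is_bfloat16_supported
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    max_steps = 60,
    learning_rate = 2e-4,
    fp16 = not is_bfloat16_supported(),  # fall back to fp16 on older GPUs such as the T4
    bf16 = is_bfloat16_supported(),      # use bf16 on Ampere or newer GPUs
    logging_steps = 1,
    output_dir = "outputs",
)
# Pass this as `args = training_args` to the SFTTrainer.
```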
Hi @danielhanchen , I have this error after your guide. Can you help me check?
Hi guys, it worked again :)) Thank you!
I am preparing a sample finetuning Colab for our upcoming Llama2-3b release, but it still seems to fail for the 3B model (I have been able to finetune it locally with Unsloth).
Colab link if someone wants to take a look: https://colab.research.google.com/drive/1HqSHW6H8vxhXpkyPr-vZAMw_6Ces0y50?usp=sharing
NotImplementedError: No operator found for memory_efficient_attention_forward
with inputs:
query : shape=(2, 1203, 32, 100) (torch.float16)
key : shape=(2, 1203, 32, 100) (torch.float16)
value : shape=(2, 1203, 32, 100) (torch.float16)
attn_bias : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p : 0.0
decoderF
is not supported because:
xFormers wasn't build with CUDA support
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
operator wasn't built - see python -m xformers.info
for more info
flshattF@0.0.0
is not supported because:
xFormers wasn't build with CUDA support
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
operator wasn't built - see python -m xformers.info
for more info
query.shape[-1] % 8 != 0
cutlassF
is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see python -m xformers.info
for more info
query.shape[-1] % 8 != 0
value.shape[-1] % 8 != 0
smallkF
is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
operator wasn't built - see python -m xformers.info
for more info
@R4ZZ3 So torch 2.2 needs another way to install Unsloth:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git@nightly"
!pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
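If the install still misbehaves, it can help to confirm what the Colab runtime actually ends up with (a generic diagnostic, not part of the official instructions; the error message itself also points to python -m xformers.info):

```python
# Quick diagnostic: print the installed versions and the GPU's compute
# capability so they can be matched against the error above.
import torch
import xformers

print("torch:", torch.__version__)
print("xformers:", xformers.__version__)
print("GPU:", torch.cuda.get_device_name(0))
print("compute capability:", torch.cuda.get_device_capability(0))  # a T4 reports (7, 5)
```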
Tried it; the references to CUDA are gone now, but the error stays:
NotImplementedError: No operator found for memory_efficient_attention_forward
with inputs:
query : shape=(2, 859, 32, 100) (torch.float16)
key : shape=(2, 859, 32, 100) (torch.float16)
value : shape=(2, 859, 32, 100) (torch.float16)
attn_bias : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p : 0.0
decoderF
is not supported because:
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
flshattF@v2.5.6
is not supported because:
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
query.shape[-1] % 8 != 0
cutlassF
is not supported because:
query.shape[-1] % 8 != 0
value.shape[-1] % 8 != 0
smallkF
is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
dtype=torch.float16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
unsupported embed per head: 100
Wait, the matrix sizes are weird - that model has a head dimension of 100? Very odd - it should be a multiple of 8
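To make that concrete: the query shape in the traceback is (batch, seq, heads, head_dim) = (2, 859, 32, 100), and the kernels listed above require head_dim % 8 == 0 ("query.shape[-1] % 8 != 0"). A quick check, with the 3200 hidden size inferred from 32 heads times head dim 100 rather than quoted from the config:

```python
# Why xformers rejects this model: its fused kernels want head_dim % 8 == 0,
# but this config gives head_dim = 100 (see "query.shape[-1] % 8 != 0" above).
hidden_size = 3200        # inferred: 32 heads * head dim 100
num_attention_heads = 32  # matches the query shape (2, 859, 32, 100)

head_dim = hidden_size // num_attention_heads
print(head_dim)           # 100
print(head_dim % 8 == 0)  # False -> cutlassF / flshattF refuse this shape
```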
Yeah, that is a weird decision. It comes from the EasyLM configs: https://huggingface.co/Finnish-NLP/llama-3b-finnish-v2/blob/main/EasyLM/models/llama/llama_model.py
But I wonder how it works locally. I need to check what versions I am running and what might cause this. Locally I work on an RTX 4080 with these package versions:
xformers==0.0.27.dev792
unsloth @ git+https://github.com/unslothai/unsloth.git@2f2b478868f63b66aaaa93db66ab3d811cddc95e
torch==2.3.0
bitsandbytes==0.43.1
flash-attn==2.5.8
I can of course work with PEFT for now.
Hmm ye best to use PEFT for now sorry :(
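For anyone stuck on a Turing GPU with this model, here is a minimal sketch of the plain PEFT route suggested above (the model id is the one linked earlier in the thread; the LoRA settings are illustrative placeholders, not a recommendation from this thread):

```python
# Minimal plain-PEFT fallback (no Unsloth / xformers), as suggested above.
# The LoRA settings are placeholders to illustrate the idea, not tuned values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "Finnish-NLP/llama-3b-finnish-v2"  # model discussed in this thread
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype = torch.float16,
    device_map = "auto",
    attn_implementation = "eager",  # sidestep the fused-kernel head_dim limits
)

lora_config = LoraConfig(
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout = 0.0,
    task_type = "CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Train with trl's SFTTrainer or a plain transformers Trainer from here.
```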
Any change here? Seeing the same issue
@HashemAlsaket Sorry! What's the current issue you have? Do you have a screenshot?
Colab is broken currently - working on a fix