Closed: danielhanchen closed this issue 5 months ago
Fixed! Please change all install instructions at the top to
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes
then change the SFTTrainer arguments from
fp16 = not torch.cuda.is_bf16_supported(),
bf16 = torch.cuda.is_bf16_supported(),
to the following (and add from unsloth import is_bfloat16_supported at the top):
# Do from unsloth import is_bfloat16_supported
fp16 = not is_bfloat16_supported(),
bf16 = is_bfloat16_supported(),
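For context, here is a minimal sketch of where those two flags live in a typical TrainingArguments block. Only the fp16 / bf16 lines are the actual fix from this comment; every other hyperparameter below is an illustrative placeholder, not part of the instructions.

```python
# Sketch only: the fp16 / bf16 lines are the fix, the rest are placeholder values.
from unsloth import is_bfloat16_supported
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    max_steps = 60,
    learning_rate = 2e-4,
    fp16 = not is_bfloat16_supported(),  # fall back to fp16 on older GPUs such as the T4
    bf16 = is_bfloat16_supported(),      # use bf16 on Ampere or newer GPUs
    logging_steps = 1,
    output_dir = "outputs",
)
# Pass this as `args = training_args` to the SFTTrainer.
```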
Hi @danielhanchen , I have this error after your guide. Can you help me check?
Hi guys, it worked again :)) Thank you!
I am preparing a sample finetuning Colab for our upcoming Llama2-3b release, but it still seems to fail for the 3B model (I have been able to finetune it locally with Unsloth).
Colab link if someone wants to take a look: https://colab.research.google.com/drive/1HqSHW6H8vxhXpkyPr-vZAMw_6Ces0y50?usp=sharing
NotImplementedError: No operator found for memory_efficient_attention_forward
with inputs:
query : shape=(2, 1203, 32, 100) (torch.float16)
key : shape=(2, 1203, 32, 100) (torch.float16)
value : shape=(2, 1203, 32, 100) (torch.float16)
attn_bias : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p : 0.0
decoderF
is not supported because:
xFormers wasn't build with CUDA support
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
operator wasn't built - see python -m xformers.info
for more info
flshattF@0.0.0
is not supported because:
xFormers wasn't build with CUDA support
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
operator wasn't built - see python -m xformers.info
for more info
query.shape[-1] % 8 != 0
cutlassF
is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see python -m xformers.info
for more info
query.shape[-1] % 8 != 0
value.shape[-1] % 8 != 0
smallkF
is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
operator wasn't built - see python -m xformers.info
for more info
@R4ZZ3 So torch 2.2 needs another way to install Unsloth:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git@nightly"
!pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
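If the install still misbehaves, it can help to confirm what the Colab runtime actually ends up with (a generic diagnostic, not part of the official instructions; the error message itself also points to python -m xformers.info):

```python
# Quick diagnostic: print the installed versions and the GPU's compute
# capability so they can be matched against the error above.
import torch
import xformers

print("torch:", torch.__version__)
print("xformers:", xformers.__version__)
print("GPU:", torch.cuda.get_device_name(0))
print("compute capability:", torch.cuda.get_device_capability(0))  # a T4 reports (7, 5)
```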
Tried it; the references to CUDA are gone now, but the error stays:
NotImplementedError: No operator found for memory_efficient_attention_forward
with inputs:
query : shape=(2, 859, 32, 100) (torch.float16)
key : shape=(2, 859, 32, 100) (torch.float16)
value : shape=(2, 859, 32, 100) (torch.float16)
attn_bias : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p : 0.0
decoderF
is not supported because:
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
flshattF@v2.5.6
is not supported because:
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
query.shape[-1] % 8 != 0
cutlassF
is not supported because:
query.shape[-1] % 8 != 0
value.shape[-1] % 8 != 0
smallkF
is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
dtype=torch.float16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
unsupported embed per head: 100
Wait, the matrix sizes are weird - that model has a head dimension of 100? Very odd - it should be a multiple of 8
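To make that concrete: the query shape in the traceback is (batch, seq, heads, head_dim) = (2, 859, 32, 100), and the kernels listed above require head_dim % 8 == 0 ("query.shape[-1] % 8 != 0"). A quick check, with the 3200 hidden size inferred from 32 heads times head dim 100 rather than quoted from the config:

```python
# Why xformers rejects this model: its fused kernels want head_dim % 8 == 0,
# but this config gives head_dim = 100 (see "query.shape[-1] % 8 != 0" above).
hidden_size = 3200        # inferred: 32 heads * head dim 100
num_attention_heads = 32  # matches the query shape (2, 859, 32, 100)

head_dim = hidden_size // num_attention_heads
print(head_dim)           # 100
print(head_dim % 8 == 0)  # False -> cutlassF / flshattF refuse this shape
```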
Yeah, that is a weird decision. It comes from the EasyLM configs: https://huggingface.co/Finnish-NLP/llama-3b-finnish-v2/blob/main/EasyLM/models/llama/llama_model.py
But I wonder how it works locally. I need to check what versions I am running and what might cause this. Locally I work on an RTX 4080 with these package versions:
xformers==0.0.27.dev792
unsloth @ git+https://github.com/unslothai/unsloth.git@2f2b478868f63b66aaaa93db66ab3d811cddc95e
torch==2.3.0
bitsandbytes==0.43.1
flash-attn==2.5.8
I can of course work with PEFT for now.
Hmm ye best to use PEFT for now sorry :(
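For anyone stuck on a Turing GPU with this model, here is a minimal sketch of the plain PEFT route suggested above (the model id is the one linked earlier in the thread; the LoRA settings are illustrative placeholders, not a recommendation from this thread):

```python
# Minimal plain-PEFT fallback (no Unsloth / xformers), as suggested above.
# The LoRA settings are placeholders to illustrate the idea, not tuned values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "Finnish-NLP/llama-3b-finnish-v2"  # model discussed in this thread
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype = torch.float16,
    device_map = "auto",
    attn_implementation = "eager",  # sidestep the fused-kernel head_dim limits
)

lora_config = LoraConfig(
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout = 0.0,
    task_type = "CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Train with trl's SFTTrainer or a plain transformers Trainer from here.
```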
Any change here? Seeing the same issue
@HashemAlsaket Sorry! What's the current issue you have? Do you have a screenshot?
Colab is broken currently - working on a fix