medhasreenivasan opened this issue 3 months ago
Yes, I am also facing the same issue as of today.
@tolas92 @medhasreenivasan Hey! Sorry for the delay! I can reproduce this error - Colab seems to have updated some of its packages, which breaks the install - working on a fix now! Thanks again and apologies!
A temporary fix: instead of installing unsloth[colab] @ git+https://github.com/unslothai/unsloth.git , do the following:
!pip install "https://download.pytorch.org/whl/cu121/xformers-0.0.24-cp310-cp310-manylinux2014_x86_64.whl" --no-deps
!pip install --upgrade transformers datasets sentencepiece tyro
!pip install --upgrade bitsandbytes accelerate trl peft --no-deps
!pip install git+https://github.com/unslothai/unsloth.git
And for Flash Attn, add the final
!pip install flash-attn einops ninja --no-deps
as well.
I'm working on a more elegant fix
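As a quick sanity check after running the installs above (this is just a suggestion, not part of the official fix), you can print the versions that actually ended up in the runtime:
import torch, xformers, transformers
print("torch:", torch.__version__)               # Colab currently ships torch 2.2.x
print("xformers:", xformers.__version__)         # should report 0.0.24 from the wheel above
print("transformers:", transformers.__version__)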
I am also facing the same issue, but @danielhanchen's temporary fix is useful.
%%capture
import torch
major_version, minor_version = torch.cuda.get_device_capability()
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    # !pip install "unsloth[colab-ampere] @ git+https://github.com/unslothai/unsloth.git"
    !pip install "https://download.pytorch.org/whl/cu121/xformers-0.0.24-cp310-cp310-manylinux2014_x86_64.whl" --no-deps
    !pip install --upgrade transformers datasets sentencepiece tyro
    !pip install --upgrade bitsandbytes accelerate trl peft --no-deps
    !pip install git+https://github.com/unslothai/unsloth.git
    !pip install flash-attn einops ninja --no-deps
Thank you.
Ok looks like I fixed it! The new section at the top will be:
%%capture
import torch
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
pass
So still ugly, but I'll handle these issues later - I updated all notebooks to use this new approach.
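If you're unsure which branch your GPU takes, torch.cuda.get_device_capability() returns a (major, minor) compute capability tuple; as a rough guide (the usual values, not something stated in this thread), a Tesla T4 reports (7, 5) and an A100 reports (8, 0), so the T4 falls into the else branch without flash-attn:
import torch
print(torch.cuda.get_device_capability())   # e.g. (7, 5) on a Tesla T4, (8, 0) on an A100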
Thanks @danielhanchen for the quick fix!
It works perfectly. Thanks @danielhanchen
Hey @danielhanchen I am facing this issue during inference:
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
    query      : shape=(1, 2327, 8, 4, 128) (torch.float16)
    key        : shape=(1, 2327, 8, 4, 128) (torch.float16)
    value      : shape=(1, 2327, 8, 4, 128) (torch.float16)
    attn_bias  : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
    p          : 0.0
flshattF@0.0.0 is not supported because:
    xFormers wasn't build with CUDA support
    requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
    operator wasn't built - see python -m xformers.info for more info
tritonflashattF is not supported because:
    xFormers wasn't build with CUDA support
    requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
    operator wasn't built - see python -m xformers.info for more info
    operator does not support BMGHK format
    triton is not available
    requires GPU with sm80 minimum compute capacity, e.g., A100/H100/L4
cutlassF is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see python -m xformers.info for more info
smallkF is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
    operator wasn't built - see python -m xformers.info for more info
    operator does not support BMGHK format
    unsupported embed per head: 128
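The traceback itself points at a diagnostic: running the following in a Colab cell shows which xformers build is installed and which operators were actually compiled in (this is just the command from the error text, nothing unsloth-specific):
!python -m xformers.info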
@koleshjr Oh no! How are you doing inference? On Colab? Did you manage to use the new install instructions in our Colab notebooks?
Yes I did. It's failing on the free-tier T4 when you call model.generate, but on the V100 it passes.
@danielhanchen These are the new install commands that you suggested in this thread:
import torch
major_version, minor_version = torch.cuda.get_device_capability()
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
else:
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
pass
This is how I am calling the model for inference:
outputs = model.generate(**inputs, max_new_tokens = 1012, use_cache = True)
result = tokenizer.batch_decode(outputs)
result
It's only failing on the Google Colab T4.
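For completeness, the surrounding code is roughly the following - the prompt string and the tokenizer call are placeholders of mine, not shown in this thread:
inputs = tokenizer([prompt], return_tensors = "pt").to("cuda")   # prompt is a placeholder string
outputs = model.generate(**inputs, max_new_tokens = 1012, use_cache = True)
result = tokenizer.batch_decode(outputs)
print(result)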
@koleshjr Would you happen to have a screenshot of the error?
@danielhanchen Apparently, for some reason, it is now fixed. Sorry for this. I appreciate your feedback though. Thanks!
No problems at all!
I am seeing this after using your update as well, but for training. Hoping it fixes itself like it did for @koleshjr, but thought I'd share a screenshot.
Ah, restarting the session did not work, but killing the runtime and starting fresh did.
@reneric Oh great you solved the issue!
I am getting the below error while trying to import FastLanguageModel from unsloth, using an A100 GPU on Colab.
Failed to import transformers.integrations.peft because of the following error (look up to see its traceback): cannot import name 'set_guard_fail_hook' from 'torch._dynamo.eval_frame'
Is there a solution available for this issue? Thank you!
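Not a confirmed fix, but a first diagnostic step for the set_guard_fail_hook error is printing the installed versions, since the traceback suggests a mismatch between the torch that Colab ships and whichever library is trying to import that symbol:
import torch, transformers, peft
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)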