Installation:
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install -U "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -U --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
!pip install --upgrade packaging
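
As a side note, it can help to record the exact package versions the Colab runtime ends up with after this install. The snippet below is an optional diagnostic sketch (not part of the original notebook), using only `importlib.metadata` from the standard library:

```python
# Optional diagnostic: print the resolved versions of the packages involved,
# so the environment where the import fails can be reproduced exactly.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["unsloth", "xformers", "trl", "peft", "accelerate", "bitsandbytes", "torch"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```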
Error screenshot: (attached image)
The error appears as soon as FastLanguageModel is imported from unsloth on Google Colab (T4):
main.py:

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! We auto support RoPE Scaling internally!
dtype = None  # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True  # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre-quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-v0.3-bnb-4bit",           # New Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/llama-3-8b-bnb-4bit",                # Llama-3 15 trillion tokens model 2x faster!
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    "unsloth/llama-3-70b-bnb-4bit",
    "unsloth/Phi-3-mini-4k-instruct",             # Phi-3 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/mistral-7b-bnb-4bit",
    "unsloth/gemma-7b-bnb-4bit",                  # Gemma 2.2x faster!
]  # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = "hf_...",  # use one if using gated models like meta-llama/Llama-2-7b-hf
)
```
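
Since the failure reportedly happens as soon as `FastLanguageModel` is imported (before `from_pretrained` is even reached), a minimal check like the one below can help confirm whether the problem is the import itself or the model load. This is only a sketch for narrowing things down; the GPU/dtype probes are an assumption about what is worth inspecting on a T4:

```python
# Minimal check: does the import alone fail, independent of model loading?
import torch

print("CUDA available:", torch.cuda.is_available())
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
print("bf16 supported:", torch.cuda.is_bf16_supported() if torch.cuda.is_available() else False)

from unsloth import FastLanguageModel  # the error occurs at this line on Colab T4
print("unsloth import OK")
```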