[Closed] Dusker233 closed this issue 1 week ago
Oh, are you using DataCollatorForCompletionOnlyLM?
Ah, I'm not sure if I used DataCollatorForCompletionOnlyLM. Here's the code I use to load the dataset:
from datasets import load_dataset
dataset = load_dataset("Dusker/chinese-laws-pretrain", split = "train")
EOS_TOKEN = tokenizer.eos_token
def formatting_prompts_func(examples):
title = examples["title"]
content = examples["content"]
texts = []
for t, c in zip(title, content):
# Must add EOS_TOKEN, otherwise your generation will go on forever!
text = t + ": " + c + EOS_TOKEN
texts.append(text)
return { "text" : texts, }
dataset = dataset.map(formatting_prompts_func, batched = True,)
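As a quick sanity check of the formatting function above, it can be run on a toy batch. The `</s>` string below is a stand-in for `tokenizer.eos_token` (the real value depends on the tokenizer), and the sample titles/contents are made up for illustration:

```python
# Stand-in for tokenizer.eos_token; the real value depends on the tokenizer.
EOS_TOKEN = "</s>"

def formatting_prompts_func(examples):
    titles = examples["title"]
    contents = examples["content"]
    texts = []
    for t, c in zip(titles, contents):
        # Must add EOS_TOKEN, otherwise generation will go on forever!
        texts.append(t + ": " + c + EOS_TOKEN)
    return {"text": texts}

# Toy batch in the same column layout the dataset uses (title/content).
batch = {
    "title": ["Article 1", "Article 2"],
    "content": ["First law text.", "Second law text."],
}
out = formatting_prompts_func(batch)
print(out["text"][0])  # "Article 1: First law text.</s>"
```

Each formatted text should end with the EOS token, which is what stops generation from running on forever.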
Ok that's weird hmmm
Hi @Dusker233, can you install the nightly branch on Kaggle and try again to see if the error is resolved?
Hi @vTuanpham, not really resolved. I encountered another error:
RuntimeError Traceback (most recent call last)
Cell In[10], line 20
7 # 4bit pre quantized models we support for 4x faster downloading + no OOMs.
8 fourbit_models = [
9 "unsloth/mistral-7b-v0.3-bnb-4bit", # New Mistral v3 2x faster!
10 "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
(...)
17 "unsloth/gemma-7b-bnb-4bit", # Gemma 2.2x faster!
    18 ] # More models at https://huggingface.co/unsloth
---> 20 model, tokenizer = FastLanguageModel.from_pretrained(
21 model_name = "Qwen/Qwen2-1.5B-Instruct", # "unsloth/mistral-7b" for 16bit loading
22 max_seq_length = max_seq_length,
23 dtype = dtype,
24 load_in_4bit = load_in_4bit,
25 # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
26 )
File /opt/conda/lib/python3.10/site-packages/unsloth/models/loader.py:210, in FastLanguageModel.from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, token, device_map, rope_scaling, fix_tokenizer, trust_remote_code, use_gradient_checkpointing, resize_model_vocab, revision, *args, **kwargs)
203 if "rope_scaling" in error.lower() and not SUPPORTS_LLAMA31:
204 raise ImportError(
205 f"Unsloth: Your transformers version of {transformers_version} does not support new RoPE scaling methods.\n"\
206 f"This includes Llama 3.1. The minimum required version is 4.43.2\n"\
207 f'Try `pip install --upgrade "transformers>=4.43.2"`\n'\
208 f"to obtain the latest transformers build, then restart this session."\
209 )
--> 210 raise RuntimeError(autoconfig_error or peft_error)
211 pass
213 # Get base model for PEFT:
RuntimeError: Can't load the configuration of 'unsloth/qwen2-1.5b-instruct-bnb-4bit'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'unsloth/qwen2-1.5b-instruct-bnb-4bit' is the correct path to a directory containing a config.json file
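For reference, the `SUPPORTS_LLAMA31` branch in the traceback boils down to comparing the installed transformers version against the 4.43.2 minimum quoted in the error message. A minimal sketch of that kind of check (not Unsloth's actual implementation):

```python
def version_tuple(v: str) -> tuple:
    # "4.43.2" -> (4, 43, 2); stops at non-numeric suffixes like "dev0".
    parts = []
    for piece in v.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

def supports_llama31(transformers_version: str) -> bool:
    # 4.43.2 is the minimum version named in the ImportError above.
    return version_tuple(transformers_version) >= (4, 43, 2)

print(supports_llama31("4.42.0"))      # False
print(supports_llama31("4.43.2"))      # True
print(supports_llama31("4.44.0.dev0")) # True
```

Tuple comparison is element-wise, so this handles differing major/minor/patch digits correctly without string comparison pitfalls.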
even weirder :(
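One thing worth ruling out for the "Can't load the configuration" error: transformers resolves a name like `unsloth/qwen2-1.5b-instruct-bnb-4bit` to a local directory first if one exists at that relative path, and only falls back to the Hugging Face Hub otherwise. A quick diagnostic sketch:

```python
import os

repo_id = "unsloth/qwen2-1.5b-instruct-bnb-4bit"  # repo id from the traceback

# If a local directory with the same relative path exists, transformers
# loads from it instead of the Hub, and fails when config.json is missing.
if os.path.isdir(repo_id):
    has_config = os.path.isfile(os.path.join(repo_id, "config.json"))
    print(f"Local directory shadows the repo id (config.json present: {has_config})")
else:
    print("No shadowing local directory; the name resolves to the Hub.")
```

If a shadowing directory shows up, renaming or deleting it lets the name resolve to the Hub again.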
I installed unsloth by these commands:
%%capture
!pip install pip3-autoremove
!pip-autoremove torch torchvision torchaudio -y
!pip install torch torchvision torchaudio xformers triton bitsandbytes trl peft
!pip install "unsloth[kaggle-nightly] @ git+https://github.com/unslothai/unsloth.git"
import os
os.environ["WANDB_DISABLED"] = "true"
It still doesn't work with these commands either:
%%capture
!mamba install --force-reinstall aiohttp -y
!pip install -U "xformers<0.0.26" --index-url https://download.pytorch.org/whl/cu121
!pip install "unsloth[kaggle-nightly] @ git+https://github.com/unslothai/unsloth.git"
# Temporary fix for https://github.com/huggingface/datasets/issues/6753
!pip install datasets==2.16.0 fsspec==2023.10.0 gcsfs==2023.10.0
!pip install bitsandbytes peft trl
import os
os.environ["WANDB_DISABLED"] = "true"
Hi @Dusker233, can you restart the session and try again with these commands:
%%capture
!pip install pip3-autoremove
!pip-autoremove torch torchvision torchaudio -y
!pip install torch torchvision torchaudio xformers triton
!pip install "unsloth[kaggle-new] @ git+https://github.com/unslothai/unsloth.git@nightly"
import os
os.environ["WANDB_DISABLED"] = "true"
Hey @vTuanpham, it works! Thanks a lot :)
Hi @danielhanchen. Right now I'm continuing to pretrain Qwen2-1.5B on Kaggle, but I've run into the problems below:
code:
error:
Is there any solution for this? It works fine when I use Google Colab. Thanks.