unslothai / unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Issue with Kaggle continued pretraining #936

Closed: Dusker233 closed this issue 1 week ago

Dusker233 commented 3 weeks ago

Hi @danielhanchen. I'm currently doing continued pretraining of Qwen2-1.5B on Kaggle, but I ran into the problem below:

code:

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 1,

    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 8,

        warmup_ratio = 0.1,
        num_train_epochs = 3,

        learning_rate = 5e-5,
        embedding_learning_rate = 5e-6,

        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.00,
        lr_scheduler_type = "cosine",
        seed = 3407,
        output_dir = "Qwen2-1.5B-pretrain",
        push_to_hub = True,
        hub_token = "",
        hub_private_repo = True,
        hub_model_id = "Qwen2-1.5B-pretrain",
    ),
)

error:

File /opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py:101, in _deprecate_arguments.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     99         message += "\n\n" + custom_message
    100     warnings.warn(message, FutureWarning)
--> 101 return f(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/trl/trainer/sft_trainer.py:202, in SFTTrainer.__init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics, peft_config, dataset_text_field, packing, formatting_func, max_seq_length, infinite, num_of_sequences, chars_per_token, dataset_num_proc, dataset_batch_size, neftune_noise_alpha, model_init_kwargs, dataset_kwargs, eval_packing)
    197     warnings.warn(
    198         "You passed a `eval_packing` argument to the SFTTrainer, the value you passed will override the one in the `SFTConfig`."
    199     )
    200     args.eval_packing = eval_packing
--> 202 if args.packing and data_collator is not None and isinstance(data_collator, DataCollatorForCompletionOnlyLM):
    203     raise ValueError(
    204         "You passed a `DataCollatorForCompletionOnlyLM` to the SFTTrainer. This is not compatible with the `packing` argument."
    205     )
    207 if is_peft_available() and peft_config is not None:

AttributeError: 'UnslothTrainingArguments' object has no attribute 'packing'

Is there any solution for this? It works fine when I use Google Colab. Thanks.
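
A possible stopgap, assuming the crash comes only from trl's SFTTrainer reading the SFTConfig-style `packing` attribute that this trl build does not find on UnslothTrainingArguments, is to set that attribute on the args object before building the trainer. This is only a sketch under that assumption, not a confirmed fix:

args = UnslothTrainingArguments(
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 8,
    learning_rate = 5e-5,
    embedding_learning_rate = 5e-6,
    fp16 = not is_bfloat16_supported(),
    bf16 = is_bfloat16_supported(),
    output_dir = "Qwen2-1.5B-pretrain",
)
# The traceback fails on `args.packing`; patch the attribute in if this trl
# version expects SFTConfig-only fields (assumption: packing is not wanted here).
if not hasattr(args, "packing"):
    args.packing = False

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = args,
)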

danielhanchen commented 3 weeks ago

Oh are you using DataCollatorForCompletionOnlyLM?

Dusker233 commented 3 weeks ago

Ah, I'm not sure whether I used DataCollatorForCompletionOnlyLM. Here's the code I use to load the dataset:

from datasets import load_dataset
dataset = load_dataset("Dusker/chinese-laws-pretrain", split = "train")
EOS_TOKEN = tokenizer.eos_token
def formatting_prompts_func(examples):
    title = examples["title"]
    content = examples["content"]
    texts = []
    for t, c in zip(title, content):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = t + ": " + c + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
dataset = dataset.map(formatting_prompts_func, batched = True,)
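
As a quick sanity check on the mapped dataset, each example's "text" field should now be a single string of the form "<title>: <content>" followed by the tokenizer's EOS token (for the Qwen2 base tokenizer that is typically "<|endoftext|>", but whatever tokenizer.eos_token holds is what gets appended):

# Print one formatted example to confirm the EOS token was appended.
print(dataset[0]["text"])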

danielhanchen commented 2 weeks ago

Ok that's weird hmmm

vTuanpham commented 1 week ago

Hi @Dusker233, can you install the nightly branch on Kaggle and try again to see if the error is resolved?

Dusker233 commented 1 week ago

Hi @vTuanpham, unfortunately it's not resolved. I encountered another error:

RuntimeError                              Traceback (most recent call last)
Cell In[10], line 20
      7 # 4bit pre quantized models we support for 4x faster downloading + no OOMs.
      8 fourbit_models = [
      9     "unsloth/mistral-7b-v0.3-bnb-4bit",      # New Mistral v3 2x faster!
     10     "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
   (...)
     17     "unsloth/gemma-7b-bnb-4bit",             # Gemma 2.2x faster!
     18 ] # More models at https://huggingface.co/unsloth
---> 20 model, tokenizer = FastLanguageModel.from_pretrained(
     21     model_name = "Qwen/Qwen2-1.5B-Instruct", # "unsloth/mistral-7b" for 16bit loading
     22     max_seq_length = max_seq_length,
     23     dtype = dtype,
     24     load_in_4bit = load_in_4bit,
     25     # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
     26 )

File /opt/conda/lib/python3.10/site-packages/unsloth/models/loader.py:210, in FastLanguageModel.from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, token, device_map, rope_scaling, fix_tokenizer, trust_remote_code, use_gradient_checkpointing, resize_model_vocab, revision, *args, **kwargs)
    203     if "rope_scaling" in error.lower() and not SUPPORTS_LLAMA31:
    204         raise ImportError(
    205             f"Unsloth: Your transformers version of {transformers_version} does not support new RoPE scaling methods.\n"\
    206             f"This includes Llama 3.1. The minimum required version is 4.43.2\n"\
    207             f'Try `pip install --upgrade "transformers>=4.43.2"`\n'\
    208             f"to obtain the latest transformers build, then restart this session."\
    209         ) 
--> 210     raise RuntimeError(autoconfig_error or peft_error)
    211 pass
    213 # Get base model for PEFT:

RuntimeError: Can't load the configuration of 'unsloth/qwen2-1.5b-instruct-bnb-4bit'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'unsloth/qwen2-1.5b-instruct-bnb-4bit' is the correct path to a directory containing a config.json file

Even weirder :(
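
If the loader is remapping "Qwen/Qwen2-1.5B-Instruct" to an unsloth 4-bit repo name it then fails to resolve, two things that may be worth trying, as a sketch rather than a confirmed fix, are naming the 4-bit checkpoint explicitly or skipping 4-bit loading entirely (using the same max_seq_length and dtype as above):

from unsloth import FastLanguageModel

# Option 1 (assumption: this 4-bit repo name resolves on the Hub):
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2-1.5B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = True,
)

# Option 2: keep the original name and load in 16-bit instead.
# model, tokenizer = FastLanguageModel.from_pretrained(
#     model_name = "Qwen/Qwen2-1.5B-Instruct",
#     max_seq_length = max_seq_length,
#     dtype = dtype,
#     load_in_4bit = False,
# )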

I installed Unsloth with these commands:

%%capture
!pip install pip3-autoremove
!pip-autoremove torch torchvision torchaudio -y
!pip install torch torchvision torchaudio xformers triton bitsandbytes trl peft
!pip install "unsloth[kaggle-nightly] @ git+https://github.com/unslothai/unsloth.git"

import os
os.environ["WANDB_DISABLED"] = "true"

It still doesn't work with these commands either:

%%capture
!mamba install --force-reinstall aiohttp -y
!pip install -U "xformers<0.0.26" --index-url https://download.pytorch.org/whl/cu121
!pip install "unsloth[kaggle-nightly] @ git+https://github.com/unslothai/unsloth.git"

# Temporary fix for https://github.com/huggingface/datasets/issues/6753
!pip install datasets==2.16.0 fsspec==2023.10.0 gcsfs==2023.10.0
!pip install bitsandbytes peft trl

import os
os.environ["WANDB_DISABLED"] = "true"

vTuanpham commented 1 week ago

Hi @Dusker233, can you restart the session and try again with these commands:

%%capture
!pip install pip3-autoremove
!pip-autoremove torch torchvision torchaudio -y
!pip install torch torchvision torchaudio xformers triton
!pip install "unsloth[kaggle-new] @ git+https://github.com/unslothai/unsloth.git@nightly"

import os
os.environ["WANDB_DISABLED"] = "false"

Dusker233 commented 1 week ago

Hey @vTuanpham, it works! Thanks a lot :)