unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

TypeError: argument of type 'NoneType' is not iterable when merging weights to 16bit and pushing to hub #666

Open premsa opened 3 months ago

premsa commented 3 months ago

Hey guys, I get the following error message after successfully fine-tuning, when trying to merge the weights and push them to the hub:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/x/.venv/lib/python3.10/site-packages/unsloth/save.py", line 1211, in unsloth_push_to_hub_merged
    unsloth_save_model(**arguments)
  File "/x/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/x/.venv/lib/python3.10/site-packages/unsloth/save.py", line 686, in unsloth_save_model
    internal_model.save_pretrained(**save_pretrained_settings)
  File "/x/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2634, in save_pretrained
    model_card = create_and_tag_model_card(
  File "/x/projects/mistral-finetune/.venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 1144, in create_and_tag_model_card
    if model_tag not in model_card.data.tags:
TypeError: argument of type 'NoneType' is not iterable
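
For context, the failing line in transformers is a plain membership test of the form model_tag not in model_card.data.tags, and testing membership against None raises exactly this TypeError whenever that tags field is None instead of a list. A minimal sketch of just that check (the None value here is my assumption, not something I inspected):

# Membership against None raises the same TypeError; against a list it works.
tags = None              # what model_card.data.tags presumably ends up as
model_tag = "unsloth"

try:
    model_tag in tags    # same shape as the check in create_and_tag_model_card
except TypeError as err:
    print(err)           # argument of type 'NoneType' is not iterable

print(model_tag in ["transformers"])  # False, and no error, once tags is a list

Here is the full script I am running: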
from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

from utils import dataset 

max_seq_length = 1048

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.3", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = 1048,
    dtype = None,
    load_in_4bit = True,
    )

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 239,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
    )

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 10,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        #num_train_epochs = 1, 
        max_steps = 1, # Set num_train_epochs = 1 for full training runs
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 239,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()

model.push_to_hub_merged("user/this-is-my-project", tokenizer, save_method = "merged_16bit", token = token)

The above code creates the config files, but fails before the weights are stored.

When saving the adapter without merging, the script does not fail and stores the adapter weights:

model.push_to_hub("user/this-is-my-project", token = token) 
tokenizer.push_to_hub("user/this-is-my-project", token = token) 

My environment:

accelerate==0.31.0
aiohttp==3.9.5
aiosignal==1.3.1
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes==0.43.1
certifi==2024.6.2
charset-normalizer==3.3.2
datasets==2.20.0
dill==0.3.7
docstring_parser==0.16
einops==0.8.0
filelock==3.15.1
flash-attn==2.5.9.post1
frozenlist==1.4.1
fsspec==2024.5.0
huggingface-hub==0.23.4
idna==3.7
Jinja2==3.1.4
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
networkx==3.3
ninja==1.11.1.1
numpy==2.0.0
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
packaging==24.1
pandas==2.2.2
peft==0.11.1
protobuf==3.20.3
psutil==5.9.8
pyarrow==16.1.0
pyarrow-hotfix==0.6
Pygments==2.18.0
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
rich==13.7.1
safetensors==0.4.3
sentencepiece==0.2.0
shtab==1.7.1
six==1.16.0
sympy==1.12.1
tokenizers==0.19.1
torch==2.3.0
tqdm==4.66.4
transformers==4.41.2
triton==2.3.0
trl==0.8.6
typing_extensions==4.12.2
tyro==0.8.4
tzdata==2024.1
unsloth @ git+https://github.com/unslothai/unsloth.git@87703089fa0ad60f008b7a7990f5cf3e77ccd26e
urllib3==2.2.2
xformers==0.0.26.post1
xxhash==3.4.1
yarl==1.9.4

Any ideas what could be going wrong?

danielhanchen commented 3 months ago

Ok that's weird - I tried on Colab and it's fine - did you add extra tags?

premsa commented 3 months ago

> Ok that's weird - I tried on Colab and it's fine - did you add extra tags?

No, I did not add anything besides what is in the above code! I am running the script on an H100, on the ampere version of unsloth.

danielhanchen commented 3 months ago

Hmm weird

whranclast commented 3 months ago

I've had the same issue, and it comes from not having a tag in your Hugging Face repository; you need to create one manually.
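
A sketch of how the tag can be added manually with huggingface_hub (the repo id, tag, and token below are placeholders; this assumes the repo created by the earlier adapter-only push already exists on the hub):

from huggingface_hub import ModelCard

token = "hf_..."  # placeholder: your Hugging Face write token

# Load the existing model card of the repo and make sure it carries at least
# one tag, so the later merged push no longer trips over tags being None.
card = ModelCard.load("user/this-is-my-project", token=token)
if card.data.tags is None:
    card.data.tags = ["unsloth"]  # any tag works; the point is that tags is a list, not None
    card.push_to_hub("user/this-is-my-project", token=token)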

danielhanchen commented 3 months ago

OO interesting - I'll check the tag issue

rishiraj commented 1 month ago

@danielhanchen @premsa this should fix the issue https://github.com/huggingface/transformers/pull/33315
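
Until a transformers release with that change is installed, the rough shape of the guard the traceback calls for looks like this (the helper name and signature are made up purely for illustration; see the linked PR for the actual change in create_and_tag_model_card):

from typing import Optional

def merge_model_card_tags(existing_tags: Optional[list], new_tags: Optional[list]) -> list:
    # Fall back to an empty list when the card has no tags, then append any
    # new tags not already present, instead of testing membership on None.
    merged = list(existing_tags or [])
    for tag in new_tags or []:
        if tag not in merged:
            merged.append(tag)
    return merged

print(merge_model_card_tags(None, ["unsloth"]))  # ['unsloth'] instead of a TypeError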