unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

[FIXED] Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3 #1059

Open · djannot opened this issue 2 months ago

djannot commented 2 months ago

I get this error:

Traceback (most recent call last):
  File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 20, in <module>
    model, tokenizer = FastLanguageModel.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/models/loader.py", line 323, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/models/llama.py", line 1610, in from_pretrained
    tokenizer = load_correct_tokenizer(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/tokenizer_utils.py", line 538, in load_correct_tokenizer
    tokenizer = _load_correct_tokenizer(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/unsloth/tokenizer_utils.py", line 496, in _load_correct_tokenizer
    fast_tokenizer = AutoTokenizer.from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 897, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2271, in from_pretrained
    return cls._from_pretrained(
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2505, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 115, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

It does work with unsloth/Llama-3.2-1B-Instruct-bnb-4bit, though.

KaiDF commented 2 months ago

I'm seeing the same issue.

danielhanchen commented 2 months ago

Oh it's best to update transformers via pip install --upgrade "transformers>=4.45"
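
If the error persists after upgrading, it's worth confirming the upgrade actually took effect in the active environment (a stale notebook kernel is a common culprit). A minimal version check, using only the standard library:

    # Print the installed versions of the packages involved in this issue;
    # transformers should report >= 4.45 after the upgrade.
    from importlib.metadata import version

    for pkg in ("transformers", "tokenizers", "unsloth"):
        print(pkg, version(pkg))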

djannot commented 2 months ago

Thanks @danielhanchen for the fast response (as usual).

I did try this, but I now get another error:

Traceback (most recent call last):
  File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 113, in <module>
    trainer_stats = trainer.train()
  File "<string>", line 145, in train
  File "<string>", line 358, in _fast_inner_training_loop
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 3477, in training_step
    self.optimizer.train()
  File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/optimizer.py", line 128, in train
    return self.optimizer.train()
AttributeError: 'AdamW' object has no attribute 'train'

danielhanchen commented 2 months ago

Ok, that's a weird error - are you using the notebooks we provided without any changes? It's possible Hugging Face's new update broke some parts.

djannot commented 2 months ago

Yes, but I've just tried creating a new conda env and in that case it works.

So something probably went wrong with the upgrades of the different packages, though I still don't understand why it worked with the 1B model.

Anyway, you can close the issue. And thanks again for the replies.

sais-github commented 2 months ago

> Yes, but I've just tried creating a new conda env and in that case it works.

This worked for me too 😸

KaiDF commented 2 months ago

But when running inference, I get:

    ValueError: Invalid cache_implementation (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']

KaiDF commented 2 months ago

> But when running inference, I get: ValueError: Invalid cache_implementation (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']

This error was fixed by upgrading unsloth to version 2024.9.post3 and transformers to version 4.45.0.

mf-skjung commented 2 months ago

> Thanks @danielhanchen for the fast response (as usual).
>
> I did try this, but I now get another error:
>
>     Traceback (most recent call last):
>       File "/home/denis/Documents/ai/unsloth/llama3-chat-template.py", line 113, in <module>
>         trainer_stats = trainer.train()
>       File "<string>", line 145, in train
>       File "<string>", line 358, in _fast_inner_training_loop
>       File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 3477, in training_step
>         self.optimizer.train()
>       File "/home/denis/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/optimizer.py", line 128, in train
>         return self.optimizer.train()
>     AttributeError: 'AdamW' object has no attribute 'train'

Upgrading accelerate to version 0.34.0 will resolve this issue.
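
As a hedged illustration of the same point, a version guard one could add before building the trainer; it assumes the packaging module, which ships as a transformers dependency:

    # Fail fast with an actionable message instead of the opaque
    # AttributeError: per this thread, accelerate >= 0.34.0 is needed
    # alongside transformers >= 4.45.
    import accelerate
    from packaging.version import Version

    if Version(accelerate.__version__) < Version("0.34.0"):
        raise RuntimeError("Upgrade accelerate: pip install --upgrade 'accelerate>=0.34.0'")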

sais-github commented 2 months ago

I'm running into this same error when trying to quantize the trained models into GGUF format:

    Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

Edit: the tokenizer unsloth exports is broken.

selectorseb commented 2 months ago

I am running into this same error as well when merging and exporting the 16-bit model and using it on vLLM. I have tried multiple models and the error is consistent. Most definitely the tokenizer exporter is broken.

Edit: with the latest version of the vLLM Docker image (v0.6.2) it now works.

lianghsun commented 1 month ago

I encountered the same issue as @selectorseb while deploying a fine-tuned Llama-3.2 model using vLLM with Docker. Initially, I faced the same problem mentioned in the original post by @djannot, but after updating the vLLM Docker image, the issue was resolved.

danielhanchen commented 1 month ago

@KaiDF Apologies, I forgot to mention that you should update Unsloth!! Glad it works now! Sorry about the issue!

@mf-skjung I'll actually edit pyproject.toml to log this - thanks!

On the rest of the issues - so the solution seems to be updating to vllm>=0.6.2? I.e. pip install --upgrade "vllm>=0.6.2"

riddle-today commented 1 month ago

I am running a notebook on Google Colab and still have this issue. I am trying to load a checkpoint from a Llama model fine-tuned with LoRA. Yesterday it worked fine, but today that changed. [screenshot showing the installed transformers, unsloth and xformers versions] If I update to transformers 4.45, I receive another error (invalid repository id).

danielhanchen commented 1 month ago

@riddle-today Apologies - can you screenshot the error? The picture you provided is just a warning - you can ignore that!

riddle-today commented 1 month ago

[screenshot of the error] It is the same error as @djannot's. The earlier picture was to show the versions of transformers, unsloth and xformers I am using. Thank you so much for the prompt answer, @danielhanchen.

riddle-today commented 1 month ago

If I go and download the tokenizer files from the HuggingFace repository and replace them, it works.
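
For anyone scripting that fix locally, a rough sketch using huggingface_hub; the base repo id and output directory are placeholders for your own setup:

    # Download known-good tokenizer files from the base model repo and
    # overwrite the ones saved next to the fine-tuned checkpoint.
    import shutil
    from huggingface_hub import hf_hub_download

    BASE_REPO = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder: base model repo
    OUTPUT_DIR = "outputs/final_model"              # placeholder: your saved model dir

    for fname in ("tokenizer.json", "tokenizer_config.json", "special_tokens_map.json"):
        path = hf_hub_download(repo_id=BASE_REPO, filename=fname)
        shutil.copy(path, f"{OUTPUT_DIR}/{fname}")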

teamclouday commented 1 month ago

Updating tokenizers to the latest 0.20.0 might help.

danielhanchen commented 1 month ago

@teamclouday Oh wait, try not to update it to 0.20!! Transformers will error out!!

@riddle-today Oh yep, apologies - I forgot to mention that you have to override the tokenizer with the latest one I uploaded!

tongyx361 commented 1 month ago

> If I go and download the tokenizer files from the HuggingFace repository and replace them, it works.

This resolves Exception: data did not match any variant of untagged enum ModelWrapper ... for me, too! It seems like some saving error?

danielhanchen commented 1 month ago

@tongyx361 Apologies for the delay - yes, the new transformers update broke saving, so you need to overwrite the old tokenizer files by redownloading them.

srsugandh commented 1 month ago

Can somebody list the steps to override the tokenizer file? I am new to this. Need help!

katopz commented 1 month ago

> Can somebody list the steps to override the tokenizer file? I am new to this. Need help!

From my understanding, it is:

  1. download the tokenizer files from the original repo
  2. and replace/upload them to yours (see the sketch below).

But I'm still stuck on another issue after that, so I can't confirm.
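
For step 2, a rough sketch of the upload half using huggingface_hub - the repo ids are placeholders, and it assumes you are authenticated (huggingface-cli login):

    # Fetch a known-good tokenizer.json from the base model repo and
    # upload it to your own fine-tuned repo, overwriting the broken one.
    from huggingface_hub import HfApi, hf_hub_download

    fixed = hf_hub_download(repo_id="meta-llama/Llama-3.2-1B-Instruct",  # placeholder base repo
                            filename="tokenizer.json")
    HfApi().upload_file(path_or_fileobj=fixed,
                        path_in_repo="tokenizer.json",
                        repo_id="your-username/your-finetuned-model")  # placeholder target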

srsugandh commented 1 month ago

I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

danielhanchen commented 1 month ago

@katopz @srsugandh Can you guys ask this on our Discord? It's probably a better place to get this resolved.

ai-nikolai commented 3 weeks ago

> I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

> @katopz @srsugandh Can you guys ask this on our Discord? It's probably a better place to get this resolved.

@katopz @danielhanchen @srsugandh - same problem here. Unsloth requires transformers < 4.45, but that doesn't work. So should we manually install a higher version of transformers to fix this issue?

ai-nikolai commented 3 weeks ago

Notebook with a working version:

@danielhanchen @katopz - here is a notebook for "offline" installation on Kaggle: (https://www.kaggle.com/code/kolyan1/offline-unsloth-package-installation-pt-2-working)

Generally, one workaround is as follows:

  1. Install unsloth==2024.10.4 and torch==2.4.1:
     pip3 install unsloth==2024.10.4 torch==2.4.1
  2. Install transformers==4.45.2 (pip will complain about the conflict, since unsloth pins transformers < 4.45, but it still installs successfully and works):
     pip3 install transformers==4.45.2
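
After the forced install, a quick smoke test to confirm the tokenizer now loads without the ModelWrapper exception. This is a hedged sketch: the model id is just a placeholder from this thread, and it assumes a GPU runtime with the packages above.

    # Loads a 4-bit model plus its tokenizer; if this returns without the
    # "untagged enum ModelWrapper" exception, the version combination works.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",  # placeholder model id
        max_seq_length=2048,
        load_in_4bit=True,
    )
    print(type(tokenizer).__name__)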

srsugandh commented 3 weeks ago

> I am still facing this issue. I have the latest "2024.10.7" version, but unsloth requires transformers < 4.45, and it is not working when I use transformers < 4.45 - I get the same error.

> @katopz @srsugandh Can you guys ask this on our Discord? It's probably a better place to get this resolved.

> @katopz @danielhanchen @srsugandh - same problem here. Unsloth requires transformers < 4.45, but that doesn't work. So should we manually install a higher version of transformers to fix this issue?

I found a workaround. I did a pip install to get the latest version of unsloth, uninstalled it, and then installed unsloth from the GitHub source (pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git") - the initial install was needed because installing from the commit alone does not pull in the related libraries. Then I installed transformers version 4.45.1 and it worked.