Open thackmann opened 1 month ago
Thank you for the miraculous "unsloth"!! It was working very well last week.
Now I am having the same problem as @thackmann:
My notebook -> transformers 4.44.2 (the same last week).
Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Same issue!
Same issue!
same issue.
Facing similar issues. Is there a fix? I'm blocked!
Same issue with Llama 3.2 3B - any solution please?
Hey guys, working on a fix. The new transformers version kind of broke everything.
Same issue. Anyone have an idea where the problem is located?
Yes, I tried to work around it using llama.cpp, but it didn't work. The issue arises when we fine-tune and save the model.
Same issue. Huge bummer - I literally spent hours fine-tuning and uploading to HF over the past couple of days, getting these errors and thinking it was me.
same issue here.
thank you @shimmyshimmer for working on the fix!
Hey guys. Yes, this is a current issue, but the team is working to fix it. If you saved the LoRA, you might not have to rerun training.
There is a workaround that was posted here and it worked for me.
https://github.com/unslothai/unsloth/issues/1062#issuecomment-2379161471
This will not work for Llama 3.2 models.
same issue!!
same issue
same issue here, any fix anyone?
Here is the error I get after trying to run a fine-tuned model via Ollama:
Error: llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model file
I have the same issue with Llama 3. llama.cpp error: 'error loading model vocabulary: cannot find tokenizer merges in model file'
Apologies guys - I was out for a few days and it's been hectic, so sorry for the delay!! Will get to the bottom of it and hopefully fix it today! Sorry, and thank you all for your patience!
I can reproduce the error - in fact, all of llama.cpp (and thus Ollama etc.) does not work with transformers>=4.45.1. I'll update everyone on a fix - it looks like Hugging Face's update most likely broke something in tokenizer exports.
@danielhanchen check this comment out, see if it helps.
https://github.com/huggingface/tokenizers/issues/1553#issuecomment-2243927115
I just communicated with the Hugging Face team - they will upstream updates to llama.cpp later in the week. It seems like tokenizers>=0.20.0 is the culprit.
I re-uploaded all Llama-3.2 models and, as a temporary fix, Unsloth will use transformers==4.44.2.
Please try again and see if it works! This unfortunately means you need to re-finetune the model if you did not save the 16-bit merged HF weights or the LoRA weights - extreme apologies! If you saved them, simply update Unsloth, then reload them and convert to GGUF.
Update Unsloth via:
pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
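If you are unsure whether your environment is affected, a small version check can help. This is a minimal sketch: the helper name `is_affected` is ours, and the >=0.20.0 threshold comes from the diagnosis above. Feed it the string from `importlib.metadata.version("tokenizers")`.

```python
import re

def is_affected(tokenizers_version: str) -> bool:
    """Return True if this tokenizers version serializes merges in the
    new (forwards-incompatible) format, i.e. tokenizers>=0.20.0.
    Helper name is ours, for illustration only."""
    # Compare only the numeric major/minor components, ignoring suffixes.
    major, minor = (int(n) for n in re.findall(r"\d+", tokenizers_version)[:2])
    return (major, minor) >= (0, 20)
```

For example, `is_affected("0.19.1")` is False, while `is_affected("0.20.0")` is True.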
I will update everyone once the Hugging Face team resolves the issue! Sorry again!
Pinging everyone (and apologies for the issues and inconvenience again!!) @xmaayy @avvRobertoAlma @thackmann @kingabzpro @williamzebrowskI @FotieMConstant @laoc81 @gianmarcoalessio @ThaisBarrosAlvim @Franky-W @Saber120 @adampetr @David33706 @Mukunda-Gogoi
Thanks @danielhanchen, and sorry for the disturbances; to give some context as to what is happening here: we updated the format of merges serialization in tokenizers to be much more flexible (this was done in this commit).
The change was designed to be backwards-compatible: tokenizers and all libraries that depend on it keep the ability to load merge files that were serialized in the old way.
However, it could not be forwards-compatible: if a file is serialized with the new format, older versions of tokenizers will not be able to load it.
This is why we're seeing this issue: new files are serialized using the new version, and these files are not loadable in llama.cpp, yet. We're updating all other codepaths (namely llama.cpp) to adapt to the new version. Once that is shipped, all your trained checkpoints will be directly loadable as usual. We're working with llama.cpp to ship this as fast as possible.
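For anyone curious what the change looks like on disk: as far as I can tell, the old tokenizer.json stored each merge as a space-joined string ("a b"), while the new format stores it as a pair (["a", "b"]). Here is a rough, unofficial sketch of downgrading a file in place so older readers can load it - the helper name `downgrade_merges` and the assumption that merges live under `model.merges` are ours; the real fix is landing upstream in llama.cpp.

```python
import json

def downgrade_merges(tokenizer_json_path: str) -> None:
    """Rewrite a tokenizer.json whose merges use the new pair format
    ([["a", "b"], ...]) back to the old space-joined string format
    ("a b"). Illustrative sketch only, not the official fix."""
    with open(tokenizer_json_path, encoding="utf-8") as f:
        data = json.load(f)
    merges = data["model"]["merges"]
    # Only rewrite if the file actually uses the new list-of-pairs format.
    if merges and isinstance(merges[0], list):
        data["model"]["merges"] = [" ".join(pair) for pair in merges]
        with open(tokenizer_json_path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
```

Since old-format files remain loadable by new tokenizers versions, this conversion loses nothing; it only restores forwards compatibility.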
Thank you!
Issue tracker in llama.cpp: https://github.com/ggerganov/llama.cpp/issues/9692
Sorry for the poor wording! Yep - so if anyone has already saved the LoRA or 16-bit weights (before converting to GGUF or Ollama), you can reload them in Unsloth after updating, then save again as a temporary solution as well.
Thank you for the update! I followed the steps you provided, and I’m happy to report that it worked perfectly on my end. I updated Unsloth, reloaded the saved weights, and successfully converted them to GGUF. Everything is running smoothly now with the transformers==4.44.2 fix.
I appreciate the quick re-upload and the detailed instructions. I’ll keep an eye out for the official update from Hugging Face, but for now, everything seems to be working great.
Thanks again for your efforts!
Best regards,
Thank you @danielhanchen for the quick fix. The original notebook is now working.
The fix is not working on Kaggle.
I get this error when I run the Colab after applying the changes; it still seems to be an issue.
@kingabzpro I just updated PyPI, so pip install unsloth should have the temporary fixes - you might have to restart Kaggle.
It is working on Kaggle now. Thank you.
Thank you for developing this useful resource. The Ollama notebook reports
{"error":"llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model file"}
This is the notebook with the error. It is a copy of the original notebook.
This seems similar to the issue reported in #1062.