Closed rwl4 closed 2 weeks ago
Agreed I want llama 3 support as well
yes yes working on it!
FIXED!!
I think you should update the README ASAP for this :) it will be good advertising. @danielhanchen
Colab notebook: https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp?usp=sharing
OMG THANK YOU SO MUCH! Already fine tuning my own models with this colab
Looking good, except the chat templating isn't quite right due to the tokenizer change.
FileNotFoundError Traceback (most recent call last)
Cell In[5], line 3
1 from unsloth.chat_templates import get_chat_template
----> 3 tokenizer = get_chat_template(
4 tokenizer,
5 chat_template = "chatml",
6 mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"},
7 map_eos_token = True,
8 )
10 def formatting_prompts_func(examples):
11 convos = examples["conversations"]
File ~/.local/lib/python3.10/site-packages/unsloth/chat_templates.py:379, in get_chat_template(tokenizer, chat_template, mapping, map_eos_token)
377 # Must fix the sentence piece tokenizer since there's no tokenizer.model file!
378 token_mapping = { old_eos_token : stop_word, }
--> 379 tokenizer = fix_sentencepiece_tokenizer(tokenizer, new_tokenizer, token_mapping,)
380 pass
382 else:
File ~/.local/lib/python3.10/site-packages/unsloth/tokenizer_utils.py:222, in fix_sentencepiece_tokenizer(old_tokenizer, new_tokenizer, token_mapping, temporary_location)
219 old_tokenizer.save_pretrained(temporary_location)
221 tokenizer_file = sentencepiece_model_pb2.ModelProto()
--> 222 tokenizer_file.ParseFromString(open(f"{temporary_location}/tokenizer.model", "rb").read())
224 # Now save the new tokenizer
225 new_tokenizer.save_pretrained(temporary_location)
FileNotFoundError: [Errno 2] No such file or directory: '_unsloth_sentencepiece_temp/tokenizer.model'
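As an aside, the `mapping` argument in the failing cell is just a key/value rename from ShareGPT-style turns (`"from"`/`"value"`, `"human"`/`"gpt"`) to ChatML-style ones (`"role"`/`"content"`, `"user"`/`"assistant"`). A minimal self-contained sketch of that translation (the `to_chatml` helper is hypothetical, not unsloth API):

```python
# The same mapping dict passed to get_chat_template in the failing cell.
mapping = {"role": "from", "content": "value", "user": "human", "assistant": "gpt"}

def to_chatml(convo):
    """Rename ShareGPT keys and role names to ChatML ones (illustrative only)."""
    role_key, content_key = mapping["role"], mapping["content"]
    role_names = {mapping["user"]: "user", mapping["assistant"]: "assistant"}
    return [
        {"role": role_names.get(turn[role_key], turn[role_key]),
         "content": turn[content_key]}
        for turn in convo
    ]

convo = [
    {"from": "human", "value": "Hello!"},
    {"from": "gpt", "value": "Hi there!"},
]
print(to_chatml(convo))
```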
@danielhanchen
It's weird that I have this issue in both unsloth and LLaMA-Factory, the exact same error, and only for the Llama 3 models.
==((====))== Unsloth: Fast Llama patching release 2024.4
\\ /| GPU: NVIDIA GeForce RTX 4090. Max memory: 23.988 GB. Platform = Linux.
O^O/ \_/ \ Pytorch: 2.1.2+cu121. CUDA = 8.9. CUDA Toolkit = 12.1.
\ / Bfloat16 = TRUE. Xformers = 0.0.25.post1. FA = False.
"-____-" Free Apache license: http://github.com/unslothai/unsloth
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
File "/home/workspace/unsl.py", line 53, in <module>
model, tokenizer = FastLanguageModel.from_pretrained(
File "/home/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/loader.py", line 132, in from_pretrained
model, tokenizer = dispatch_model.from_pretrained(
File "/home/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 1085, in from_pretrained
tokenizer = load_correct_tokenizer(
File "/home/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/tokenizer_utils.py", line 262, in load_correct_tokenizer
fast_tokenizer.add_bos_token = slow_tokenizer.add_bos_token
AttributeError: 'PreTrainedTokenizerFast' object has no attribute 'add_bos_token'. Did you mean: '_bos_token'?
Edit: A complete reinstall solved it.
@rwl4 Working on the chat template issues! Yep @Sneakr a complete reinstall would work - sorry about the issues
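For anyone hitting the same error, a clean reinstall along these lines is what worked above. This is only a sketch: the extras tag (`colab-new`) varies by CUDA/torch setup, so check the unsloth README for the one matching your environment.

```shell
# Sketch of a complete reinstall; adjust the extras tag to your setup.
pip uninstall -y unsloth
pip cache purge
pip install --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
```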
I have a question regarding Llama 3 finetuning. There are two released versions of Llama 3: base and instruction-finetuned. Is the current model (unsloth/llama-3-8b-bnb-4bit) the base model or the instruction-tuned one? If it's the base model, will the instruction-tuned model also be added?
@arunpatala Base model.
The Instruct is unsloth/llama-3-8b-Instruct-bnb-4bit
No the base model is purely a pretrained model with no instruction finetuning
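Switching a notebook from the base to the Instruct checkpoint is just a change of model name. A sketch assuming the usual unsloth loading call seen in the traceback above (the `max_seq_length` value is arbitrary):

```python
from unsloth import FastLanguageModel

# Same loading call as in the notebook, pointed at the instruction-tuned
# 4-bit repo instead of the base "unsloth/llama-3-8b-bnb-4bit".
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length = 2048,   # arbitrary; pick what your data needs
    load_in_4bit   = True,
)
```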
Thanks for the information.
I am able to LoRA-finetune with the Instruct model now.
Noticing that non-quantized versions of Llama-3-70B don't seem to be available on Unsloth?
For example, here is non-quantized vs 4bit quantized Llama-3-8B:
On the other hand, only the 4bit 70B model appears to be available:
Very new to Unsloth, so I may very well be missing something here!
Sadly, the non-quantized versions are nearly impossible to finetune in 16-bit on a single GPU anyway, so they're not uploaded
It looks like the tokenizer patching breaks. Here's the log: