huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. #26768

Closed twers1 closed 11 months ago

twers1 commented 11 months ago

I have this Python code:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cpu"

# Load the model and its tokenizer, then move the model to the target device.
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
model.to(device)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

vegetarian_recipe_prompt = """### Instruction: Act as a gourmet chef.
I have a friend coming over who is a vegetarian.
I want to impress my friend with a special vegetarian dish.
What do you recommend?

Give me two options, along with the whole recipe for each.

 ### Answer:
 """

# Tokenize the prompt into PyTorch tensors.
encoded_instruction = tokenizer(vegetarian_recipe_prompt, return_tensors="pt", add_special_tokens=True)

model_inputs = encoded_instruction.to(device)

# Sample up to 500 new tokens; the EOS token doubles as the padding token.
generated_ids = model.generate(**model_inputs, max_new_tokens=500, do_sample=True, pad_token_id=tokenizer.eos_token_id)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])

ArthurZucker commented 11 months ago

Actually, this warning should not be triggered, since the added tokens are placed at the beginning of the vocab. Thanks for reporting!
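For readers who want to check this themselves, here is a minimal sketch. The helper name `embeddings_cover_vocab` is hypothetical and not part of the library. It assumes `tokenizer` and `model` are the objects loaded in the snippet above, and that `len(tokenizer)` counts the added tokens. It only verifies that every token id has a row in the input embedding matrix. It cannot tell whether those rows were actually trained.

```python
def embeddings_cover_vocab(tokenizer, model) -> bool:
    """Return True if every token id the tokenizer can emit has an embedding row.

    Hypothetical helper, not a transformers API. It relies only on
    len(tokenizer) and model.get_input_embeddings().num_embeddings.
    """
    vocab_size = len(tokenizer)  # full vocabulary, including added tokens
    embedding_rows = model.get_input_embeddings().num_embeddings
    return vocab_size <= embedding_rows
```

If this returns False, `model.resize_token_embeddings(len(tokenizer))` would add the missing rows, and those new rows are the ones the warning asks you to fine-tune or train.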

twers1 commented 11 months ago

> Actually, this warning should not be triggered, since the added tokens are placed at the beginning of the vocab. Thanks for reporting!

How can I get an answer from the model, then? It prints this warning and the command "python openai.py" never finishes.

ArthurZucker commented 11 months ago

It will be fixed by #26570! Otherwise, it's just a warning and should not impact your code.
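Since the message is only a warning, one way to keep it out of the output is a standard-library logging filter. This is a sketch under assumptions rather than an official recipe: the logger name `transformers.tokenization_utils_base` is a guess at where the message is emitted, and the class and function names are made up for illustration.

```python
import logging

# Start of the warning text from the issue title (matched case-insensitively).
WARNING_PREFIX = "special tokens have been added in the vocabulary"

# Assumption: the warning is logged by this module's logger.
ASSUMED_LOGGER_NAME = "transformers.tokenization_utils_base"


class DropSpecialTokensWarning(logging.Filter):
    """Reject log records whose message starts with WARNING_PREFIX."""

    def filter(self, record: logging.LogRecord) -> bool:
        return not record.getMessage().lower().startswith(WARNING_PREFIX)


def mute_special_tokens_warning(logger_name: str = ASSUMED_LOGGER_NAME) -> logging.Filter:
    """Attach the filter to the named logger and return it so it can be removed later."""
    warning_filter = DropSpecialTokensWarning()
    logging.getLogger(logger_name).addFilter(warning_filter)
    return warning_filter
```

Calling `mute_special_tokens_warning()` before `AutoTokenizer.from_pretrained(...)` should hide only this message and leave other warnings visible. A blunter alternative is `transformers.logging.set_verbosity_error()`, which silences all library warnings.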

MittelmanDaniel commented 1 month ago

I still have this issue when using the library on Kaggle.

ArthurZucker commented 1 month ago

I think we recently removed the warning: #32138