Open · muhammadumair894 opened 5 months ago
Is this a Mistral model?
We're seeing this same problem with Mistral NeMo. Is this something we need to worry about, and is it possible to correct after fine-tuning, or does it invalidate our current model?
@hannesfant Oh, is the output correct? They're just warnings.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results. Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.

Unsloth: Not a fast tokenizer, so can't process it as of yet :( Please log a Github issue if you want this as a new feature! Your chat template will still work, but it won't add or edit tokens.

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

tokenizer = get_chat_template(
    tokenizer,
    chat_template = "mistral",  # Supports zephyr, chatml, mistral, llama-3, alpaca, vicuna, vicuna_old, unsloth
    mapping = {"role": "from", "content": "value", "user": "human", "assistant": "gpt"},  # ShareGPT style
    map_eos_token = True,  # Maps <|im_end|> to </s> instead
)

FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

messages = [
    {"from": "human", "value": "I am on social security disability income. I own a house that has equity. It's homesteaded and we've been here for 20 years. I have little consumer debt, except a few charged off accounts in dispute"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must add for generation
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 1024, use_cache = True)
tokenizer.batch_decode(outputs)
```
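For what it's worth, here is a minimal sketch (not from the original post) of how to make both warnings go away, assuming the same `model`, `tokenizer`, and `messages` objects as above: pass `return_dict = True` to `apply_chat_template` so you also get an `attention_mask` back, and give `generate()` an explicit `pad_token_id`:

```python
# Minimal sketch, assuming `model`, `tokenizer`, and `messages` from the snippet above.
# return_dict=True makes apply_chat_template return input_ids AND attention_mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
).to("cuda")

outputs = model.generate(
    input_ids = inputs["input_ids"],
    attention_mask = inputs["attention_mask"],  # silences the attention-mask warning
    pad_token_id = tokenizer.eos_token_id,      # silences the pad_token_id warning
    max_new_tokens = 1024,
    use_cache = True,
)
print(tokenizer.batch_decode(outputs))
```

With a single unpadded prompt, the mask `generate()` infers is all ones anyway, which is why the output can still look correct even when the warning appears.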