xerkey closed this issue 5 months ago
Hi, thanks for trying the demo and being interested in our work!
This is odd, because the model has not been updated and we set do_sample=False. Could you share your transformers and torch versions? I tried again on my side, and the model still doesn't know the correct answer.
You could also try this:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
device = "cuda"
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
input_text = """[INST] Answer the questions:
Question: What company advertised itself with the slogan "We'll leave a light on for you"? [/INST] The answer is:"""
encodeds = tokenizer(input_text, return_tensors="pt")
model_inputs = encodeds.to(device)
model.to(device)
generated_ids = model.generate(**model_inputs, max_new_tokens=20, do_sample=False, pad_token_id=tokenizer.eos_token_id)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
Thanks. I tried "mistralai/Mistral-7B-Instruct-v0.2" with the code above, and I get the same output:
Question: What company advertised itself with the slogan "We'll leave a light on for you"? [/INST] The answer is: Motel 6. Motel 6 is a budget motel chain in the United States and
So it seems that Mistral-7B already has this knowledge.
My environment is as follows: Apple Silicon (M2 Max) with miniforge, transformers 4.42.0.dev0, torch 2.4.0.dev20240601
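For anyone who wants to report the same details, here is a minimal sketch that collects the installed versions using only the standard library. The helper name report_versions is hypothetical and not part of the xRAG repository; the default package names are the two asked about in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("transformers", "torch")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to the string "not installed".
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    # Print one "name version" pair per line, ready to paste into an issue.
    for name, ver in report_versions().items():
        print(f"{name} {ver}")
```

Reading the metadata instead of importing the packages keeps the check fast and avoids loading torch just to print a version string.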
Maybe it is a precision problem? I am not sure whether Apple Silicon chips support bf16. Could you try on a Linux distribution, with the packages installed as required in the Dockerfile?
I found the cause: precision. When using torch_dtype=torch.float32, the model gets the answer right. I believe this problem is hard to mitigate, since models are generally run in half precision for faster inference :(
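To illustrate why precision can change the output even with do_sample=False, here is a toy sketch in pure Python. It rounds two made-up logits to bfloat16 and shows that greedy decoding can pick a different token. The logit values, the helper names to_bfloat16 and greedy_pick, and the two-token setup are assumptions for illustration only. In a real model the rounding error accumulates across layers rather than appearing once at the end, so this is not a reproduction of what happened with Mistral-7B.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 keeps the top 16 bits of the float32 bit pattern, so only
    7 explicit mantissa bits survive. NaN and infinity are not handled.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Add a bias so that truncating the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]


def greedy_pick(logits):
    """Return the index of the largest logit (first index wins on ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical logits for two candidate next tokens, nearly tied.
logits_fp32 = [12.49, 12.51]
logits_bf16 = [to_bfloat16(v) for v in logits_fp32]

print("float32 logits :", logits_fp32, "-> picks index", greedy_pick(logits_fp32))
print("bfloat16 logits:", logits_bf16, "-> picks index", greedy_pick(logits_bf16))
```

Between 8 and 16, adjacent bfloat16 values are 0.0625 apart, so both logits collapse to 12.5 and the tie is broken by position instead of by value. Greedy decoding is deterministic for a given set of logits, but the logits themselves depend on the dtype and the hardware.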
Got it. Thanks for verifying! I'll check on a Linux distribution when I have a chance.
The xrag-7b model already knew about Motel 6 when I tried tutorial.ipynb.
Was the model updated?
This is the response without RAG or xRAG.
But thanks for providing a helpful tutorial.