Hannibal046 / xRAG

[NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token

The xrag-7b already knew about Motel 6. #3

Closed: xerkey closed this issue 5 months ago

xerkey commented 5 months ago

The xrag-7b model already knew about Motel 6 when I tried tutorial.ipynb.

Was the model updated?

This is the response without RAG or xRAG.

Motel 6. Motel 6 is a budget motel chain in the United States and

But thanks for providing a helpful tutorial.

Hannibal046 commented 5 months ago

Hi, thanks for trying the demo and being interested in our work!

This is strange, because the model has not been updated and we set do_sample=False. Could you share your transformers and torch versions? I tried again on my side, and it still doesn't know the correct answer:

[screenshot: model output, without the correct answer]
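If it helps, here is a quick way to report both versions (a trivial snippet, not part of the repo):

```python
# Print the library versions relevant to this issue
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
```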

Hannibal046 commented 5 months ago

You could also try this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda"

# Load the vanilla Mistral-7B-Instruct backbone in bf16
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

input_text = """[INST] Answer the questions:

Question: What company advertised itself with the slogan "We'll leave a light on for you"? [/INST] The answer is:"""

encodeds = tokenizer(input_text, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

# Greedy decoding (do_sample=False) so the output is deterministic
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=20,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

xerkey commented 5 months ago

Thanks. I tried "mistralai/Mistral-7B-Instruct-v0.2" with the above code and got the same output.

Question: What company advertised itself with the slogan "We'll leave a light on for you"? [/INST] The answer is: Motel 6. Motel 6 is a budget motel chain in the United States and

So I understand that Mistral-7B already had this knowledge.

My environment is as follows: Apple Silicon (M2 Max) with miniforge, transformers 4.42.0.dev0, torch 2.4.0.dev20240601.

Hannibal046 commented 5 months ago

Maybe it's a precision problem? I am not sure whether the Apple Silicon chip supports bf16. Could you try on a Linux distribution with the packages installed as required in the Dockerfile?
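For what it's worth, here is a minimal probe for bf16 support on Apple's MPS backend (a sketch only; the exact behavior depends on the macOS and PyTorch versions):

```python
# Check whether a bfloat16 tensor can be created on the MPS backend
import torch

if torch.backends.mps.is_available():
    try:
        x = torch.ones(2, dtype=torch.bfloat16, device="mps")
        print("bf16 on MPS works:", x.dtype)
    except Exception as e:
        print("bf16 on MPS failed:", e)
else:
    print("MPS backend not available")
```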

Hannibal046 commented 5 months ago

I found the catch: precision. When using torch_dtype=torch.float32, the model gets the answer correctly. I believe this problem is hard to mitigate, since models generally use half precision for faster inference :(
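For anyone who wants to reproduce the comparison, here is a minimal sketch that runs the same greedy generation under both dtypes (this assumes a CUDA GPU with enough memory to load the 7B model once per dtype; the prompt is the one from the snippet above):

```python
# Compare greedy generations of the same prompt under bf16 and fp32
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(name)
prompt = """[INST] Answer the questions:

Question: What company advertised itself with the slogan "We'll leave a light on for you"? [/INST] The answer is:"""
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

for dtype in (torch.bfloat16, torch.float32):
    model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=dtype).to("cuda")
    ids = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(dtype, "->", tokenizer.batch_decode(ids)[0])
    # Free the GPU before loading the next dtype
    del model
    torch.cuda.empty_cache()
```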

xerkey commented 5 months ago

Got it, thanks for verifying! I'll check on a Linux distribution when I have a chance.