yixuantt / MultiHop-RAG

Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024)

Having problems reproducing generation accuracy results #14

Closed songsey closed 1 month ago

songsey commented 1 month ago

I want to reproduce Table 6 (generation accuracy). Can you tell me how to do this with the current code?

I'm particularly interested in the OpenAI GPT results. GPT-4 already seems to be quite a strong baseline for "inference query" type questions, even without a RAG approach.

To do this, would you need to provide an OpenAI version of qa_llama.py?

yixuantt commented 1 month ago

Hi, for OpenAI, you can just replace the generation function with an OpenAI chat API call.

songsey commented 1 month ago

OK, can you tell me why qa_llama.py doesn't generate responses using the transformers pipeline? For example: `pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)`. qa_llama.py doesn't seem to extend to other models if we use the `model.generate` method.

yixuantt commented 1 month ago

Yes, you can also use that method. qa_llama.py is just an example of how to generate answers.
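A pipeline-based variant, as suggested above, could be sketched like this; the helper names and the model checkpoint are illustrative, not part of the repository:

```python
# Sketch of model-agnostic answer generation via the transformers pipeline API.
# Assumes: `pip install transformers torch`; any causal LM checkpoint works.
from transformers import pipeline


def build_pipe(model_name: str):
    """Build a text-generation pipeline for the given checkpoint."""
    return pipeline("text-generation", model=model_name)


def generate_answer(pipe, prompt: str) -> str:
    """Generate an answer, returning only the newly generated text."""
    outputs = pipe(prompt, max_new_tokens=128, return_full_text=False)
    return outputs[0]["generated_text"]
```

Because `pipeline` handles tokenization and decoding internally, swapping in a different model is just a matter of changing the checkpoint name, which is the portability the question is after.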

songsey commented 1 month ago

I see. Lastly, it would be good if I could do retrieval and generation at once. Currently I need to run retrieval first, save its output, and then run generation, right?

yixuantt commented 1 month ago

Actually, you can adapt the process to suit your needs. Since different RAG frameworks employ different retrieval methods, this repository serves merely as a basic demonstration of how to use the dataset; it is the same simple RAG framework whose performance is reported in the paper.

In summary, you can combine the two processes if applicable. I'm also looking forward to seeing new RAG frameworks. 😊
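Combining the two stages into one pass can be sketched as below. This is a toy illustration, not the repository's code: `retrieve` is a stand-in word-overlap retriever, and `generate` is whatever LLM call you plug in (e.g. one of the generation functions discussed above):

```python
# Hypothetical single-pass RAG loop: retrieve evidence, build the prompt,
# and generate the answer in one call. The retriever here is a toy
# word-overlap ranker standing in for a real retrieval method.

def retrieve(query: str, corpus: list[str], top_k: int = 3) -> list[str]:
    """Rank corpus chunks by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda chunk: len(q & set(chunk.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def rag_answer(query: str, corpus: list[str], generate) -> str:
    """Retrieve evidence and generate the answer in a single function call."""
    evidence = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{evidence}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)
```

With this shape, swapping the toy `retrieve` for a dense or BM25 retriever and `generate` for an LLM call gives the fused retrieve-then-generate pipeline the question asks about.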