AkariAsai / self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through Self-Reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
https://selfrag.github.io/
MIT License

Question about multi-content references #88

Open 256785 opened 1 month ago

256785 commented 1 month ago

When training the generation model, each training example contains one question, one or zero retrieved passages, an answer, and some special tokens. During inference, if a question needs information from several passages, how are the answers aggregated?

256785 commented 1 month ago

For example, suppose the question is about the UK. Passage one is about UK food and passage two is about UK history. During Self-RAG inference, would I get only one answer, covering either UK food or UK history, but not the information from both passages?

fate-ubw commented 1 month ago

During Self-RAG inference, the first stage generates top-k candidate answers (one per retrieved passage), then Self-RAG ranks all candidates by a score computed from the special reflection tokens. As a result, Self-RAG outputs only one final answer. The situation is different in long-form inference because of the beam-search mechanism. I have rewritten the Self-RAG algorithm, providing clearer and more concise code, which has been integrated into the RAGLAB library. For your question, you can refer to this part of the code: https://github.com/fate-ubw/RAGLAB/blob/main/raglab/rag/infer_alg/self_rag_reproduction/selfrag_reproduction.py#L137
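
As a rough illustration of the ranking step described above (this is not the repository's or RAGLAB's actual code; the `Candidate` fields, the weights, and `candidate_score` are all illustrative assumptions): each retrieved passage yields one candidate answer, each candidate is scored from the probabilities the model assigns to the reflection tokens, and only the top-scoring candidate is returned, which is why the UK food / UK history example above ends up with a single answer.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    passage: str
    answer: str
    p_relevant: float   # probability of the [Relevant] token (passage relevant to the query)
    p_supported: float  # probability of the [Fully supported] token (answer grounded in the passage)
    p_useful: float     # probability of the highest utility token (answer judged useful)

def candidate_score(c: Candidate,
                    w_rel: float = 1.0,
                    w_sup: float = 1.0,
                    w_use: float = 0.5) -> float:
    """Weighted combination of reflection-token probabilities (weights are illustrative)."""
    return w_rel * c.p_relevant + w_sup * c.p_supported + w_use * c.p_useful

def select_best(candidates: list[Candidate]) -> Candidate:
    """Rank the per-passage candidates and keep only the top one,
    which is why short-form inference returns a single final answer."""
    return max(candidates, key=candidate_score)

if __name__ == "__main__":
    candidates = [
        Candidate("UK food passage", "Fish and chips is a traditional UK dish.",
                  p_relevant=0.9, p_supported=0.8, p_useful=0.7),
        Candidate("UK history passage", "The UK was formed by the 1707 Acts of Union.",
                  p_relevant=0.6, p_supported=0.9, p_useful=0.8),
    ]
    best = select_best(candidates)
    print(best.answer)  # only the highest-scoring candidate is returned
```

To combine information from several passages you would need the long-form (beam-search) setting, where generation proceeds segment by segment and different segments can be grounded in different passages.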