AkariAsai / self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
https://selfrag.github.io/
MIT License

Why is retrieval dependent on only the previous span? #7

Closed nightlessbaron closed 9 months ago

nightlessbaron commented 9 months ago

Hi, why is the input to the retrieval (x + y_{t-1}) and not (x + y_{<t})? Also, a minor correction to a file name: requirementd.txt -> requirements.txt

AkariAsai commented 9 months ago

We use the original input together with the previous generation as the retrieval query (the input to the retrieval model). If we used all previously generated sentences y_{<t}, the retrieved results would be biased towards earlier generations, e.g., y_1, which might not be closely related to y_t, especially as the generation gets longer. Figure 6 in In-Context Retrieval-Augmented Language Models also reports that model performance starts deteriorating once retrieval queries get longer. Note that our decision on whether to retrieve is based on y_{<t}, as indicated in the Table!
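
To make the distinction in the reply above concrete, here is a minimal sketch of the two inputs. It is not taken from the repository: the function names `build_retrieval_query` and `build_decision_context` are hypothetical, and joining text with a single space is an assumption made only for illustration.

```python
from typing import List


def build_retrieval_query(x: str, segments: List[str]) -> str:
    """Retriever input: the original input x plus only the previous segment y_{t-1}.

    Hypothetical helper for illustration; not the repository's API.
    """
    if not segments:
        # No segment generated yet, so the query is just the input.
        return x
    return f"{x} {segments[-1]}"


def build_decision_context(x: str, segments: List[str]) -> str:
    """Context for the retrieve-or-not decision: x plus all earlier segments y_{<t}.

    Hypothetical helper for illustration; not the repository's API.
    """
    return " ".join([x, *segments])


if __name__ == "__main__":
    x = "Describe the history of the bicycle."
    y_prev = ["Early designs had no pedals.", "Chain drives came later."]
    print(build_retrieval_query(x, y_prev))
    print(build_decision_context(x, y_prev))
```

With this split the retrieval query stays short and focused on the most recent segment, while the decision on whether to retrieve still sees the whole generation so far.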

nightlessbaron commented 9 months ago

I see, thanks a lot for sharing this :D