AkariAsai / self-rag

This repository contains the original implementation of SELF-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
https://selfrag.github.io/
MIT License

The meaning of "_w_gs.jsonl" in evaluation data #58

Open · qiweijian opened this issue 6 months ago

qiweijian commented 6 months ago

Thanks for your incredible work!

I notice that there are four files for short form QA in the eval data folder.

I am wondering what 'w_gs' means and, more specifically:

  1. How is the ctxs field built?
  2. Why do the ctxs in popqa_longtail.jsonl and popqa_longtail_w_gs.jsonl differ in number and values? (Some contexts in popqa_longtail_w_gs.jsonl don't have a document id or score.)
  3. Why does triviaqa_test_w_gs_df only have 7313 samples while triviaqa_test.jsonl has 11313? How was it filtered? (A quick sketch for comparing the files directly follows this list.)
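For reference, here is a minimal inspection sketch, assuming the eval files are plain JSONL with a top-level `ctxs` list per instance and that the local paths below match the downloaded file names (adjust as needed):

```python
import json

# Hypothetical local paths; point these at wherever the eval data files live.
FILES = [
    "popqa_longtail.jsonl",
    "popqa_longtail_w_gs.jsonl",
    "triviaqa_test.jsonl",
    "triviaqa_test_w_gs.jsonl",
]

for path in FILES:
    with open(path) as f:
        items = [json.loads(line) for line in f]
    # Count contexts per instance and how many contexts lack an id/score.
    ctx_counts = [len(item.get("ctxs", [])) for item in items]
    missing_meta = sum(
        1
        for item in items
        for ctx in item.get("ctxs", [])
        if "id" not in ctx or "score" not in ctx
    )
    avg = sum(ctx_counts) / len(items) if items else 0.0
    print(f"{path}: {len(items)} instances, "
          f"avg {avg:.1f} ctxs per instance, "
          f"{missing_meta} ctxs missing id/score")
```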
AkariAsai commented 6 months ago

Hi! `_gs` indicates that the retrieved results are further enhanced by Google Programmable Search, in addition to the original Contriever top-10 documents. We added these new results in our updated manuscript, Section B.2; that is why the number of contexts differs. The different number of instances does seem odd, though, and I may have uploaded an incorrect file... Let me double-check!
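If the contexts without a document id/score are indeed the Google Programmable Search additions (only an assumption based on the observation in the question above, not something confirmed here), a rough way to separate the two sources per instance could look like this:

```python
import json

def split_ctxs(item):
    """Split an instance's ctxs into contexts with retrieval metadata
    (id/score, presumably the original Contriever top-10) and contexts
    without it (presumably the Google Programmable Search additions).
    This split rule is an assumption, not confirmed by the authors."""
    with_meta = [c for c in item["ctxs"] if "id" in c and "score" in c]
    without_meta = [c for c in item["ctxs"] if "id" not in c or "score" not in c]
    return with_meta, without_meta

with open("popqa_longtail_w_gs.jsonl") as f:
    first = json.loads(next(f))  # inspect just the first instance

contriever_ctxs, search_ctxs = split_ctxs(first)
print(f"{len(contriever_ctxs)} contriever ctxs, {len(search_ctxs)} search ctxs")
```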

naknak-choi commented 2 months ago

Hi there! I'm so sorry, but where can I find the eval data folder, or how can I generate it?