AkariAsai / self-rag

This repository contains the original implementation of SELF-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
https://selfrag.github.io/
MIT License

The meaning of "_w_gs.jsonl" in evaluation data #58

Open · qiweijian opened this issue 6 months ago

qiweijian commented 6 months ago

Thanks for your incredible work!

I notice that there are four files for short form QA in the eval data folder.

I am wondering what 'w_gs' means and, more specifically:

  1. How is the ctxs field built?
  2. Why do the ctxs in popqa_longtail.jsonl and popqa_longtail_w_gs.jsonl differ in number and values? (Some contexts in popqa_longtail_w_gs.jsonl don't have a document id or score.)
  3. Why does triviaqa_test_w_gs_df only have 7313 samples while triviaqa_test.jsonl has 11313? How was it filtered? (A quick sketch for comparing the files directly follows this list.)
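For reference, here is a minimal inspection sketch, assuming the eval files are plain JSONL with a top-level `ctxs` list per instance and that the local paths below match the downloaded file names (adjust as needed):

```python
import json

# Hypothetical local paths; point these at wherever the eval data files live.
FILES = [
    "popqa_longtail.jsonl",
    "popqa_longtail_w_gs.jsonl",
    "triviaqa_test.jsonl",
    "triviaqa_test_w_gs.jsonl",
]

for path in FILES:
    with open(path) as f:
        items = [json.loads(line) for line in f]
    # Count contexts per instance and how many contexts lack an id/score.
    ctx_counts = [len(item.get("ctxs", [])) for item in items]
    missing_meta = sum(
        1
        for item in items
        for ctx in item.get("ctxs", [])
        if "id" not in ctx or "score" not in ctx
    )
    avg = sum(ctx_counts) / len(items) if items else 0.0
    print(f"{path}: {len(items)} instances, "
          f"avg {avg:.1f} ctxs per instance, "
          f"{missing_meta} ctxs missing id/score")
```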
AkariAsai commented 6 months ago

Hi! `_gs` indicates that the retrieved results are further enhanced by Google Programmable Search, in addition to the original Contriever top-10 documents. We added these new results in our updated manuscript, Section B.2; that is why the number of contexts differs. The different number of instances does seem odd, though, and I may have uploaded an incorrect file... Let me double-check!
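If the contexts without a document id/score are indeed the Google Programmable Search additions (only an assumption based on the observation in the question above, not something confirmed here), a rough way to separate the two sources per instance could look like this:

```python
import json

def split_ctxs(item):
    """Split an instance's ctxs into contexts with retrieval metadata
    (id/score, presumably the original Contriever top-10) and contexts
    without it (presumably the Google Programmable Search additions).
    This split rule is an assumption, not confirmed by the authors."""
    with_meta = [c for c in item["ctxs"] if "id" in c and "score" in c]
    without_meta = [c for c in item["ctxs"] if "id" not in c or "score" not in c]
    return with_meta, without_meta

with open("popqa_longtail_w_gs.jsonl") as f:
    first = json.loads(next(f))  # inspect just the first instance

contriever_ctxs, search_ctxs = split_ctxs(first)
print(f"{len(contriever_ctxs)} contriever ctxs, {len(search_ctxs)} search ctxs")
```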

naknak-choi commented 2 months ago

Hi there! I'm so sorry, but where can I find the eval data folder, or how can I generate it?