AlexTMallen / adaptive-retrieval

Questions about PopQA’s entity source and sampling methods #8

Open yiming-zh opened 4 months ago

yiming-zh commented 4 months ago

Hello, I read your article and found it very interesting. I'm currently doing research that builds on your work, but I've run into some trouble.

I have the following two questions:

AlexTMallen commented 4 months ago

Hi! We forgot to include a citation there for the C4 dataset. It can be found at this link; please use the English subset. The first 800 MB random sample is the one we used.

Here is the code for dataset creation. m2rqa.db is from the RoMQA dataset.
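
For anyone trying to reproduce the corpus sample, here is a minimal sketch of one way to read "the first 800 MB": stream documents in order and stop once a byte budget is reached. This is not the authors' script. The helper name `take_first_bytes`, the choice to measure size as UTF-8 bytes, and the reading of 800 MB as 800 * 10**6 bytes are all assumptions made for illustration.

```python
from typing import Iterable, Iterator


def take_first_bytes(docs: Iterable[str], budget_bytes: int) -> Iterator[str]:
    """Yield documents in order, stopping before the UTF-8 total exceeds budget_bytes.

    Hypothetical helper: the thread does not say how the 800 MB was measured.
    """
    used = 0
    for text in docs:
        size = len(text.encode("utf-8"))
        if used + size > budget_bytes:
            break  # the next document would overshoot the budget
        used += size
        yield text


# Assumption: "800 MB" is taken to mean 800 * 10**6 bytes of raw text.
BUDGET_BYTES = 800 * 10**6

# Tiny demonstration with an in-memory list and a 5-byte budget.
sample = list(take_first_bytes(["ab", "cde", "fghij"], 5))
print(sample)
```

In practice `docs` would be an iterator over the text field of the English C4 subset, from whichever copy of the dataset you download. Because the function only consumes an iterable, it works the same on a streamed dataset or on local files.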