spacemanidol / MSMARCO

Utilities, Baselines, Statistics and Descriptions Related to the MSMARCO DATASET
MIT License
190 stars 41 forks

Regarding the Test Set for Q&A #38

tahmedge closed 4 years ago

tahmedge commented 4 years ago

Hi,

On your website, you have training, dev, and evaluation sets for Q&A. What is the difference between the dev set and the evaluation set? And where can I find the test data to submit results to the leaderboard?

spacemanidol commented 4 years ago

@tahmedge The eval set is the test set. It is held out for blind evaluation.

tahmedge commented 4 years ago

The instructions say we need to "Run the evaluation script on the test/dev set and generate the output results file for submission." Since the test data doesn't contain the answers, what reference file should I use to run the evaluation script? Also, how do I submit results for only the well-formed version?