Closed mfayoub closed 2 years ago
Hi @mfayoub
msmarco-v1-passage is the index you are looking for, I guess (it has 8.8 million passages, i.e. the original version of the MS MARCO passage ranking corpus).
msmarco-v2-passage is a new corpus released in 2021: https://microsoft.github.io/msmarco/TREC-Deep-Learning-2021.html#please-read-data-refresh
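If it helps to tell the two corpora apart from the ids alone, here is a minimal sketch. It assumes v1 passage ids are plain integers and that v2 ids follow the pattern msmarco_passage_<file number>_<byte offset>, which matches the id quoted in this thread. The helper name classify_passage_id is made up for illustration and is not part of Pyserini.

```python
import re

# Assumed v2 id layout: msmarco_passage_<file number>_<byte offset>
V2_ID = re.compile(r"msmarco_passage_(\d+)_(\d+)")


def classify_passage_id(docid: str) -> str:
    """Guess which prebuilt index a passage id came from (hypothetical helper)."""
    if re.fullmatch(r"\d+", docid):
        # Plain integer: the original 8.8M-passage corpus.
        return "msmarco-v1-passage"
    if V2_ID.fullmatch(docid):
        # Prefixed id: the 2021 refresh.
        return "msmarco-v2-passage"
    return "unknown"


if __name__ == "__main__":
    for docid in ["msmarco_passage_04_180136318", "7187158"]:
        print(docid, "->", classify_passage_id(docid))
```

An id that matches neither pattern is reported as "unknown" rather than guessed.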
Thanks MXueguang for your reply! Yes, I tried msmarco-v1-passage, and it seems to return the expected doc_ids.
Hi everyone,
I've downloaded the prebuilt index named "msmarco-v2-passage", and then I tried a simple search. The resulting passages have a different format than what I was expecting. For example, a passage id that came out of my search looks like msmarco_passage_04_180136318, whereas I expected a passage id to be just a number (from 1 to 8.8 million). Is that right, or am I doing something wrong?