TheMrSheldon closed this issue 4 months ago
Thanks for the report! It seems the links have changed very recently, per the tracking here: https://ir-datasets.com/downloads
Version 0.5.6 updates the URLs, so you should be good to go* after updating!
pip install --upgrade ir-datasets==0.5.6
* Should work for everything except msmarco-qna, which seems to have been removed completely. I've opened an issue about it.
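To confirm that the upgrade took effect, a minimal sketch along these lines could compare the installed version against 0.5.6. The helper names `parse_version` and `has_url_fix` are made up for illustration, and the distribution name is assumed to match the one used in the pip command above.

```python
from importlib import metadata

# Release named in this thread as carrying the updated URLs.
REQUIRED = (0, 5, 6)


def parse_version(text):
    """Turn '0.5.6' into (0, 5, 6); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_url_fix(installed):
    """True if the installed version string is at least REQUIRED."""
    return parse_version(installed) >= REQUIRED


if __name__ == "__main__":
    try:
        # Assumption: the distribution is registered under this name.
        installed = metadata.version("ir-datasets")
    except metadata.PackageNotFoundError:
        print("ir-datasets is not installed")
    else:
        print(installed, "ok" if has_url_fix(installed) else "needs upgrade")
```

The comparison is deliberately simple and ignores pre-release suffixes; for anything stricter, a proper version parser would be the safer choice.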
Thank you very much for your hard work!
I see that the msmarco-qna issue is resolved as well, so I consider this resolved.
Describe the bug
The download URLs for MSMARCO seem to have been moved to a new domain. For example, the test queries for TREC DL '19 Passage were previously at
https://msmarco.blob.core.windows.net/msmarcoranking/msmarco-test2019-queries.tsv.gz
and can now be found at
https://msmarco.z22.web.core.windows.net/msmarcoranking/msmarco-test2019-queries.tsv.gz
(see also here: https://microsoft.github.io/msmarco/TREC-Deep-Learning-2019.html).
From a cursory glance, I believe that replacing
https://msmarco.blob.core.windows.net/
with
https://msmarco.z22.web.core.windows.net/
in ir_datasets/etc/downloads.json should fix this.

Affected dataset(s)
At least msmarco-passage.

To Reproduce
Access msmarco-passage/train.
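
For anyone who cannot upgrade right away, the prefix replacement proposed in the bug description could be sketched roughly as follows. This is only an illustration: the function `rewrite_urls` and the sample entry are hypothetical, and the real layout of downloads.json may differ, which is why the sketch walks any nested JSON value instead of assuming specific keys.

```python
import json

OLD_PREFIX = "https://msmarco.blob.core.windows.net/"
NEW_PREFIX = "https://msmarco.z22.web.core.windows.net/"


def rewrite_urls(node):
    """Return a copy of a parsed JSON value with the old host replaced in every string."""
    if isinstance(node, dict):
        return {key: rewrite_urls(value) for key, value in node.items()}
    if isinstance(node, list):
        return [rewrite_urls(item) for item in node]
    if isinstance(node, str) and node.startswith(OLD_PREFIX):
        return NEW_PREFIX + node[len(OLD_PREFIX):]
    # Numbers, booleans, None, and unrelated strings pass through unchanged.
    return node


# Hypothetical sample entry; the keys are made up for illustration.
sample = {
    "msmarco-passage": {
        "trec-dl-2019/queries": {
            "url": OLD_PREFIX + "msmarcoranking/msmarco-test2019-queries.tsv.gz",
            "note": "other fields are left untouched",
        }
    }
}

patched = rewrite_urls(sample)
print(json.dumps(patched, indent=2))
```

Applying this to the actual file would mean loading it with `json.load`, passing the result through `rewrite_urls`, and writing it back with `json.dump`. Upgrading to the fixed release remains the simpler route.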