Open GiacoL opened 1 month ago
Hi @GiacoL, there's a separate zip file for the train set: "nq-train". All the available datasets are listed at https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/
The URL https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/ contains the NQ dataset, but I didn't find train.csv in nq.zip.
I downloaded the NQ dataset and the tsv file for the train set appears to be missing.
Please, where did you finally download train.csv from?
Thank you very much!
------------------ Original Message ------------------ From: "beir-cellar/beir"; Sent: Thursday, August 15, 2024, 10:31 AM; Subject: Re: [beir-cellar/beir] NQ - File datasets/nq/qrels/train.tsv not present (Issue #179)
Hi @Gerry-j https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nq-train.zip
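For anyone who wants to script the download, here is a minimal sketch using only the Python standard library. The helper name `download_and_extract` is hypothetical and not part of BEIR, and the folder layout inside nq-train.zip is not assumed: the function simply returns the archive's member names so you can locate the train qrels yourself.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the reply above.
NQ_TRAIN_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nq-train.zip"


def download_and_extract(url: str, out_dir: str) -> list:
    """Download a zip archive into out_dir, extract it there,
    and return the list of member names found in the archive.

    Hypothetical helper for illustration; not a BEIR function.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path component of the URL.
    zip_path = out / url.rsplit("/", 1)[-1]

    # Skip the download if the archive is already on disk.
    if not zip_path.exists():
        with urllib.request.urlopen(url) as resp, open(zip_path, "wb") as fh:
            shutil.copyfileobj(resp, fh)

    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
        return zf.namelist()


# Example call (downloads a large file, so it is left commented out):
# names = download_and_extract(NQ_TRAIN_URL, "datasets")
# print([n for n in names if n.endswith(".tsv")])
```

Filtering the returned names for `.tsv` entries is one way to confirm that the train qrels are actually present in the archive before pointing a loader at it.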
Hi all, thanks for this info. Is the corpus not the same between them? I see 18,060,996 lines in the corpus for this link, but the BEIR NQ corpus for test has 2,681,468. Perhaps the train set has the unfiltered corpus while the test set has the filtered version.
It seems like the qrels for train also reference documents up to 18 million, so it appears one would have to index the train corpus separately to use these.
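One way to check the mismatch described above is to count how many document IDs in the train qrels are absent from a given corpus file. The sketch below assumes the usual BEIR file layout: a `corpus.jsonl` with one JSON object per line carrying an `_id` field, and a tab-separated qrels file with a `query-id`, `corpus-id`, `score` header. The function names are hypothetical, and the file paths in the usage comment are placeholders to adjust to wherever the archives were extracted.

```python
import csv
import json


def load_corpus_ids(corpus_path: str) -> set:
    """Collect the `_id` of every document in a BEIR-style corpus.jsonl."""
    ids = set()
    with open(corpus_path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # tolerate blank lines
                ids.add(json.loads(line)["_id"])
    return ids


def missing_qrels_docs(qrels_path: str, corpus_ids: set) -> set:
    """Return the corpus IDs referenced in the qrels file that are
    not present in corpus_ids."""
    missing = set()
    with open(qrels_path, encoding="utf-8", newline="") as fh:
        # Assumes a header row: query-id, corpus-id, score
        reader = csv.DictReader(fh, delimiter="\t")
        for row in reader:
            if row["corpus-id"] not in corpus_ids:
                missing.add(row["corpus-id"])
    return missing


# Example call (paths are placeholders):
# test_ids = load_corpus_ids("datasets/nq/corpus.jsonl")
# gaps = missing_qrels_docs("datasets/nq-train/qrels/train.tsv", test_ids)
# print(len(gaps), "train qrels documents are not in the test corpus")
```

If the result is non-empty when the train qrels are compared against the test corpus, that would support the reading that the two corpora differ and the train corpus needs its own index. Note that holding 18 million IDs in a Python set takes a fair amount of memory.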