beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

NQ training qrels missing #108

Closed cadurosar closed 1 year ago

cadurosar commented 1 year ago

Hey, I was looking into the training sets contained in BEIR, and it seems that the NQ one is missing. The archive at the current link only has the test set (https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nq.zip). Could you help me with this?

thakur-nandan commented 1 year ago

Hi @cadurosar,

The nq folder provided only contains the test set. For the NQ training qrels, have a look at https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nq-train.zip.

The two are packaged separately because training and test use different corpora. The training corpus contains duplicate passages, which is fine for training but not suitable for validation.
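
For anyone landing here later, a minimal sketch of reading the training qrels once nq-train.zip is unpacked. It assumes the archive follows the usual BEIR layout (a tab-separated qrels file with a `query-id`, `corpus-id`, `score` header row); the helper names `dataset_url` and `read_qrels` are hypothetical and not part of the beir package.

```python
import csv
import io
from collections import defaultdict

# Base URL taken from the links in this thread.
BEIR_BASE_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets"


def dataset_url(name: str) -> str:
    """Build the download URL for a dataset archive, e.g. 'nq' or 'nq-train'."""
    return f"{BEIR_BASE_URL}/{name}.zip"


def read_qrels(handle) -> dict:
    """Parse a qrels TSV into {query_id: {corpus_id: score}}.

    Assumes one header row followed by three tab-separated columns
    (query id, corpus id, integer relevance score).
    """
    qrels = defaultdict(dict)
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        query_id, corpus_id, score = row[0], row[1], int(row[2])
        qrels[query_id][corpus_id] = score
    return dict(qrels)


# Small in-memory example standing in for the unpacked training qrels file.
sample = io.StringIO(
    "query-id\tcorpus-id\tscore\n"
    "q1\td1\t1\n"
    "q1\td2\t1\n"
    "q2\td3\t1\n"
)
train_qrels = read_qrels(sample)
train_url = dataset_url("nq-train")
```

With the beir package installed, `util.download_and_unzip(url, out_dir)` followed by `GenericDataLoader(data_folder=...).load(split="train")` should do the same job. Keep the nq-train corpus for training only and evaluate against the separate nq test corpus.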

cadurosar commented 1 year ago

Thanks @thakur-nandan!