Closed: davidmrau closed this issue 2 years ago
I would echo this request, since I cannot reproduce the BM25 (Elasticsearch, default settings) results on certain datasets, particularly the ones with fewer documents such as NFCorpus and SciFact.
I also found that the scores vary slightly between runs. Is there any randomness inside Elasticsearch? Maybe @thakur-nandan can provide some insights? Thanks!
First of all, thanks for this amazing benchmark. I'd like to evaluate a re-ranking model on several datasets. If I understand correctly, I will have to download and index each dataset separately to get the top-100 BM25 ranked lists. Could you please provide those lists for each dataset for easier evaluation?
Thanks a lot!
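For anyone unsure what a "top-100 BM25 ranked list" means here, below is a minimal pure-Python sketch. It is not Elasticsearch's implementation and will not reproduce the benchmark numbers: the whitespace tokenizer, the function name `bm25_top_k`, and the tie-breaking by document ID are all assumptions made for illustration. The values `k1=1.2` and `b=0.75` are the usual BM25 defaults.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase + whitespace split, far simpler than a real analyzer.
    return text.lower().split()


def bm25_top_k(corpus, query, k=100, k1=1.2, b=0.75):
    """Rank documents for one query with textbook BM25.

    corpus: dict mapping doc_id -> text.
    Returns up to k (doc_id, score) pairs, best first.
    """
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(doc_tokens)
    total_len = sum(len(tokens) for tokens in doc_tokens.values())
    if n_docs == 0 or total_len == 0:
        return []
    avg_len = total_len / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in doc_tokens.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in doc_tokens.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        if score > 0:
            scores[doc_id] = score

    # Sort by descending score, then by doc_id, so equal scores
    # always come out in the same order across runs.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

Calling `bm25_top_k(corpus, query, k=100)` once per query gives the kind of candidate list a re-ranker would then rescore. The explicit secondary sort key only makes this sketch deterministic; whether ties explain the run-to-run differences mentioned in this thread is not something the sketch can confirm.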