-
Hi @vlad-karpukhin ,
I wanted to replicate the BM25 retrieval results reported in Table 2 of the DPR paper. When I read footnote 8:
> Lucene implementation. BM25 parameters b = 0.4 (document le…
-
**Describe the bug**
This commit (https://github.com/microsoft/msmarco/commit/41b3a684ed8ebd4e753250c3687547a77c62e7dd) updated the qrel files and changed their md5 hashes; now the download of the qrels …
-
The element description of [`<exhibHist>`](https://music-encoding.org/guidelines/dev/elements/exhibHist.html) states that it encodes:
> A record of public exhibitions, including dates, venues, etc.
The ele…
-
As your paper notes, you used L2 similarity during end-to-end retrieval, but in your code (`index_ranker->rank()`), in the second stage of end-to-end retrieval, you rerank the passages with cosine si…
-
Hello,
I have read your paper and am quite interested in your work! I have a question about the tokens.
I notice you truncate the passage tokens to 120 in MSMARCO Passage Retrieval; however, fo…
-
I want to use the dataset in "full-mode" and I am trying to create the knowledge base. Could you please let me know which JSON file I should use?
Also please let me know what are "newsdial-XXX"…
-
Hi,
The integration for our Margin-MSE checkpoint was super smooth - thanks so much for that :) So I am back with a new model: I just published our TAS-Balanced trained, DistilBERT-based checkpoint…
-
Can we remove these scripts now that these features have been integrated into pyserini?
This is okay from the pygaggle perspective too, right?
cc @KaiSun314
-
Hi! Thanks for your work in this repo!
I was trying to reproduce the bi/cross-encoder on the MS MARCO dataset. However, a few things confuse me:
1) I saw that you mentioned in the descript…
-
Hi, I'm replicating [transformer-kernel-ranking](https://github.com/sebastian-hofstaetter/transformer-kernel-ranking), which is a SOTA model for information retrieval. When I read the README.md of …