How about we make the current `QueryEncoder` an abstract class, and make subclasses `TCTColBERTQueryEncoder` and `DPRQueryEncoder`.
The `DPRQueryEncoder` would wrap this: https://huggingface.co/transformers/model_doc/dpr.html#transformers.DPRQuestionEncoder
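Roughly something like the sketch below; the method names and pooling details are just assumptions for illustration, not the final Pyserini API:

```python
from abc import ABC, abstractmethod

import numpy as np
from transformers import (AutoModel, AutoTokenizer,
                          DPRQuestionEncoder, DPRQuestionEncoderTokenizer)


class QueryEncoder(ABC):
    """Abstract base class: encode a query string into a dense vector."""

    @abstractmethod
    def encode(self, query: str) -> np.ndarray:
        ...


class TCTColBERTQueryEncoder(QueryEncoder):
    def __init__(self, model_name: str):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)

    def encode(self, query: str) -> np.ndarray:
        # Sketch: mean-pool the token embeddings as the query representation
        # (the exact TCT-ColBERT pooling may differ).
        inputs = self.tokenizer(query, return_tensors='pt')
        embeddings = self.model(**inputs).last_hidden_state.detach().numpy()
        return np.mean(embeddings[0], axis=0)


class DPRQueryEncoder(QueryEncoder):
    def __init__(self, model_name: str):
        self.tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(model_name)
        self.model = DPRQuestionEncoder.from_pretrained(model_name)

    def encode(self, query: str) -> np.ndarray:
        # DPRQuestionEncoder exposes the question embedding as pooler_output.
        input_ids = self.tokenizer(query, return_tensors='pt')['input_ids']
        return self.model(input_ids).pooler_output.detach().numpy().flatten()
```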
Yes, I think this is the right approach, although `TCTColBERTQueryEncoder` looks really ugly. I don't have any better suggestions though.
As an aside, this also means that at some point we'll need to build sparse indexes for the Wikipedia collection used in DPR.
Ref: #325 - code merged!
@MXueguang We need a replication guide for this also...
Currently, we have: https://github.com/castorini/pyserini/blob/master/docs/dense-retrieval.md
Would it make sense to break it into:

- `dense-retrieval-msmarco-passage.md`
- `dense-retrieval-msmarco-doc.md`
- `dense-retrieval-dpr.md`
Thoughts?
Yes. For msmarco-doc, we'll do that after we finish the msmarco-doc experiment. For DPR, I guess we need to evaluate the results with the downstream QA evaluation?
> for msmarco-doc: we'll do that after we finish the msmarco-doc experiment
Yup.
> for dpr, I guess we need to evaluate the result by the downstream qa evaluation?
No, let's focus on only the retriever stage. The architecture is retriever-reader, right? And the DPR paper gives component effectiveness of only the retriever stage. Let's try to match those numbers.
How do we deal with the DPR retrieval evaluation, since it's different from regular IR tasks (i.e., evaluation by qrels)? Two solutions:
Let's do (1) for now and just check in the official DPR eval script, just like we've checked in the MS MARCO scripts. Might want to put it into `tools/` so PyGaggle can also use it, right @ronakice?
Hmm, I don't think they have an official "script" to evaluate with; they wrap the evaluation inside their retrieval functions here. I am evaluating with a script I wrote myself.
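Conceptually it just computes top-k retrieval accuracy: the fraction of questions for which at least one of the top-k retrieved passages contains a gold answer string. A rough sketch is below; the run-file format and the text normalization are simplifications I'm assuming, not the exact script:

```python
import json
import re
import string
import unicodedata


def normalize(text: str) -> str:
    # Simplified normalization; the official DPR code uses a regex-based
    # SimpleTokenizer, so exact numbers can differ slightly.
    text = unicodedata.normalize('NFD', text).lower()
    text = re.sub(f'[{re.escape(string.punctuation)}]', ' ', text)
    return ' '.join(text.split())


def has_answer(answers, passage_text) -> bool:
    """True if any gold answer string appears in the passage text."""
    passage = normalize(passage_text)
    return any(normalize(answer) in passage for answer in answers)


def top_k_accuracy(run, k: int) -> float:
    """`run` maps question id -> {'answers': [...], 'contexts': [{'text': ...}, ...]},
    with contexts already sorted by retrieval score."""
    hits = sum(
        1 for entry in run.values()
        if any(has_answer(entry['answers'], ctx['text']) for ctx in entry['contexts'][:k])
    )
    return hits / len(run)


if __name__ == '__main__':
    with open('run.dpr.nq-test.json') as f:  # hypothetical run file
        run = json.load(f)
    for k in (20, 100):
        print(f'Top{k}: {top_k_accuracy(run, k)}')
```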
With my script, I am getting the following, compared against their reported numbers:

| | Top-20 | Top-100 |
|---|---|---|
| Mine | 0.7794906931597579 | 0.8460660043393856 |
| Theirs | 0.784 | 0.854 |
A bit lower, but I am using the HNSW index right now; will evaluate on the brute-force (flat) index next.
Will continue the discussion about replication results in https://github.com/castorini/pyserini/issues/336
We can fold all the DPR collections into Pyserini, so we can do the retriever part of a QA system directly in Pyserini.
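For example, once the DPR pieces land, the retriever-only usage could look roughly like this; the module path, class names, and prebuilt index name here are assumptions about how it might be exposed, not the final API:

```python
# Sketch of retriever-only DPR usage in Pyserini; names below are assumed.
from pyserini.dsearch import SimpleDenseSearcher, DPRQueryEncoder

encoder = DPRQueryEncoder('facebook/dpr-question_encoder-single-nq-base')
searcher = SimpleDenseSearcher.from_prebuilt_index('wikipedia-dpr', encoder)  # assumed index name
hits = searcher.search('who got the first nobel prize in physics', k=20)
for i, hit in enumerate(hits[:5]):
    print(f'{i + 1:2} {hit.docid} {hit.score:.4f}')
```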