castorini / ura-projects


Build out cosDPR-distil regressions for TREC 2019 and TREC 2020 for Anserini #12

Status: Closed (lintool closed this issue 7 months ago)

lintool commented 9 months ago

Here is a concrete task. If you look at our Anserini regressions, under "MS MARCO V1 Passage Regressions", we have missing entries for TREC 2019 and TREC 2020.

[Image: screenshot taken 2023-09-15]

The concrete task is to build these regressions.

We need to encode the queries and then convert them into Anserini's JSON format. @MXueguang might actually have them encoded already (in numpy?)... in which case we just have to convert them over.
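As a rough sketch of the conversion step, assuming the query IDs and embedding vectors are already loaded in memory (for example, from a NumPy file), something like the following could write one JSON object per line. The function name `write_query_vectors_jsonl` is hypothetical, and the `qid`/`vector` field names are an assumption about the target format; check them against an existing Anserini dense-query topics file before relying on this.

```python
import json


def write_query_vectors_jsonl(qids, vectors, out_path):
    """Write one JSON object per query: {"qid": ..., "vector": [...]}.

    `qids` is a sequence of query IDs and `vectors` is a sequence of
    equal-length numeric rows (plain lists or rows of a NumPy array).
    The field names "qid" and "vector" are assumptions, not confirmed here.
    """
    if len(qids) != len(vectors):
        raise ValueError("qids and vectors must have the same length")

    with open(out_path, "w", encoding="utf-8") as f:
        for qid, vec in zip(qids, vectors):
            # Convert each component to a plain Python float so that
            # NumPy scalar types serialize cleanly with json.dumps.
            record = {"qid": str(qid), "vector": [float(x) for x in vec]}
            f.write(json.dumps(record) + "\n")

    return len(qids)
```

If the encoded queries do turn out to be stored as a NumPy array, each row can be passed through as-is, since the function only iterates over rows and casts each value to `float`.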

Warmup tasks:

(1) Reproduce cosDPR-distil on MS MARCO: https://github.com/castorini/anserini/blob/master/docs/regressions/regressions-msmarco-passage-cos-dpr-distil.md - make sure you can get it running on the student Linux environment.

(2) To understand the context of what you're doing, read: https://cs.uwaterloo.ca/~jimmylin/publications/Ma_etal_CIKM2023.pdf

This is related to #3 - @pratyushpal and @mchlp you might be interested.

pratyushpal commented 9 months ago

@lintool I'll work on it!

pratyushpal commented 9 months ago

This task was solved in this PR: https://github.com/castorini/anserini/pull/2204

lintool commented 7 months ago

This is done, closing.