This issue isn't a problem to be fixed; rather, it is a record to keep track of undergraduate students' work on TREC 2024. In a series of comments, contributors can write about their work (at a high level) on specific tracks. This issue will be updated as tasks and tracks are finished.
Assisted in the creation of an f-stage, gte-qwen2 dense retrieval baseline. Assisted in writing scripts for encoding the corpus and queries, and then running retrieval with the resulting embeddings.
Created BM25 document translation and query translation baselines. Indexed corpus, ran baselines, and evaluated all created runs.
Created a SPLADE document-translation baseline. Encoded the corpus and queries using a SPLADE model, indexed the corpus embeddings, ran the baseline, and evaluated the results. This baseline was not used in submissions, given its surprisingly low eval scores.
Created and ran several data munging scripts to reformat queries and corpus.
Task: Multilingual Retrieval (MLIR)
Created and used multiple reformatting scripts, the most notable of which converts a TREC run into the retrieval results format.
Created SPLADE, BM25-dt, BM25-qt, and PLAID baselines for the task (basically all f-stage baselines). Ran all baselines, evaluated the runs, fused the runs with RRF, and then evaluated the fusion runs. The fusion runs were sent off to the mono stage.
Fused (with RRF) and evaluated post-mono and post-listo runs.
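For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal sketch of the fusion step mentioned above. It is not the script actually used for these runs: the function name `rrf_fuse`, the in-memory run layout (query ID mapped to a ranked list of document IDs), and the constant `k=60` are assumptions chosen for illustration.

```python
from collections import defaultdict


def rrf_fuse(runs, k=60, depth=1000):
    """Fuse several ranked runs with reciprocal rank fusion.

    runs:  list of dicts mapping query_id -> list of doc_ids, best first
           (hypothetical layout, not the project's actual file format).
    k:     smoothing constant; 60 is a commonly used default (assumption).
    depth: number of fused results to keep per query.
    """
    fused = {}
    # Collect every query that appears in at least one run.
    query_ids = set().union(*(run.keys() for run in runs))
    for qid in query_ids:
        scores = defaultdict(float)
        for run in runs:
            # Ranks are 1-based; a document missing from a run adds nothing.
            for rank, doc_id in enumerate(run.get(qid, []), start=1):
                scores[doc_id] += 1.0 / (k + rank)
        # Sort by fused score (descending), breaking ties by doc_id.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        fused[qid] = ranked[:depth]
    return fused


if __name__ == "__main__":
    bm25_run = {"q1": ["a", "b", "c"]}
    splade_run = {"q1": ["b", "c", "a"]}
    for doc_id, score in rrf_fuse([bm25_run, splade_run])["q1"]:
        print(doc_id, round(score, 5))
```

Because RRF uses only ranks, it can combine runs whose raw scores are on different scales, such as BM25 and SPLADE scores.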
Task: Cross-language Retrieval (CLIR)
Taking the post-mono fused runs from MLIR, I combined the zho, rus, and fas runs to create a top-300 retrieval results run for CLIR. This new retrieval results file was then sent off to list-wise reranking.
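One possible way to combine per-language runs into a single top-300 list is sketched below. This is an illustration under stated assumptions, not a description of the script that was used: it assumes scores are comparable across languages (for example, all come from the same fusion step), that document IDs do not collide between languages, and that runs are held in memory as lists of `(doc_id, score)` pairs. The name `merge_language_runs` is hypothetical.

```python
import heapq


def merge_language_runs(runs_by_lang, depth=300):
    """Merge per-language runs into one ranked list per query.

    runs_by_lang: dict mapping language code -> {query_id: [(doc_id, score), ...]}
                  (hypothetical layout; scores are assumed comparable).
    depth:        number of results to keep per query.
    """
    candidates = {}
    for run in runs_by_lang.values():
        for qid, ranking in run.items():
            # Pool the candidates for this query across all languages.
            candidates.setdefault(qid, []).extend(ranking)
    # Keep the `depth` highest-scoring documents, best first.
    return {
        qid: heapq.nlargest(depth, pool, key=lambda pair: pair[1])
        for qid, pool in candidates.items()
    }


if __name__ == "__main__":
    runs = {
        "zho": {"q1": [("zho_1", 0.9), ("zho_2", 0.2)]},
        "rus": {"q1": [("rus_1", 0.5)]},
        "fas": {"q1": [("fas_1", 0.7)]},
    }
    print(merge_language_runs(runs, depth=3)["q1"])
```

If the per-language scores were not comparable, a rank-based interleaving would be a safer choice than sorting on raw scores.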
Task: Cross-Language Report Generation
Created SPLADE, BM25-dt, and PLAID baselines for the task (all f-stage baselines). Ran all baselines, evaluated the runs, fused the runs with RRF, and then evaluated the fusion runs. The fusion runs were sent off to the mono stage.