Document Ranking with a Pretrained Sequence-to-Sequence Model

pratham4521 commented 2 months ago

Title

Team Name

Team DSSM

Email

202311022@daiict.ac.in

Team Member 1 Name

Pratham Patel

Team Member 1 Id

202311022

Team Member 2 Name

Nishit Munjani

Team Member 2 Id

202311026

Team Member 3 Name

Rohit Rathod

Team Member 3 Id

202311039

Team Member 4 Name

Ayushi Mehta

Team Member 4 Id

202311008

Problem Statement

The project aims to re-rank retrieved documents using pretrained sequence-to-sequence models such as T5.
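As a rough illustration of the reranking idea, the sketch below follows the input template and scoring rule described in the referenced paper (arXiv:2003.06713): each query-document pair is turned into a text prompt, and the document is scored by a softmax over the model's logits for the target words "true" and "false". No model is loaded here; `logit_fn` is a hypothetical stand-in for whatever T5 wrapper the team ends up using, and the function names are illustrative only.

```python
import math

def build_input(query: str, document: str) -> str:
    # Input template as described in the referenced paper.
    return f"Query: {query} Document: {document} Relevant:"

def relevance_probability(true_logit: float, false_logit: float) -> float:
    # Softmax restricted to the two target-word logits ("true" vs "false").
    m = max(true_logit, false_logit)  # subtract the max for numerical stability
    e_true = math.exp(true_logit - m)
    e_false = math.exp(false_logit - m)
    return e_true / (e_true + e_false)

def rerank(query, candidates, logit_fn):
    # candidates: list of (doc_id, text) pairs from a first-stage retriever.
    # logit_fn: hypothetical callable mapping an input string to
    #           (true_logit, false_logit); a real T5 wrapper would go here.
    scored = []
    for doc_id, text in candidates:
        true_logit, false_logit = logit_fn(build_input(query, text))
        scored.append((doc_id, relevance_probability(true_logit, false_logit)))
    # Highest probability of "true" first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice `candidates` would be the top-k documents from a first-stage retriever, and long documents may need to be split into passages before scoring.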

Evaluation Strategy

Average Precision (AP), Precision, and NDCG
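For reference, a minimal sketch of how these three metrics can be computed for a single query. It assumes binary relevance for Precision and AP, and a linear gain with a log2 discount for NDCG; some toolkits use an exponential gain instead, so numbers may differ from theirs. The function names and the cutoff `k` are illustrative choices, not part of the proposal.

```python
import math

def precision_at_k(ranked, relevant, k):
    # Fraction of the top-k ranked doc ids that are relevant (k must be > 0).
    return sum(1 for d in ranked[:k] if d in relevant) / k

def average_precision(ranked, relevant):
    # Mean of precision values at each rank where a relevant doc appears,
    # normalised by the total number of relevant docs.
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

def ndcg_at_k(ranked, gains, k):
    # gains: dict doc_id -> graded relevance (missing ids count as 0).
    dcg = sum(gains.get(d, 0) / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Averaging `average_precision` over all queries gives MAP. For reported results, an established evaluation tool is preferable to hand-written code.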

Dataset

TREC Robust 2004 (Robust04): https://trec.nist.gov/data/robust/04.guidelines.html

Resources

Document Ranking with a Pretrained Sequence-to-Sequence Model (arXiv): https://arxiv.org/abs/2003.06713

parth126 commented 2 months ago

It is unclear whether the proposed dataset is usable for the given task.
Please verify the following:
a. Understand the dataset and how relevance is computed for it.
b. Run a feasibility test (for example, by comparing an existing implementation against a tf-idf baseline) to ensure that the project is actually viable.
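One possible shape for the tf-idf side of such a feasibility test is sketched below. It is a deliberately simple baseline: whitespace tokenisation, length-normalised term frequency, and a smoothed idf. These are all assumptions made for illustration, and the function names are hypothetical. The resulting ranking can be evaluated on a small sample of queries and compared against the existing T5 implementation.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenisation (assumption; no stemming or stopwords).
    return text.lower().split()

def tfidf_rank(query, docs):
    # docs: dict doc_id -> text. Returns (doc_id, score) pairs, best first.
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())
    query_terms = tokenize(query)
    scores = []
    for doc_id, term_counts in counts.items():
        length = sum(term_counts.values()) or 1
        score = 0.0
        for term in query_terms:
            tf = term_counts[term] / length  # Counter returns 0 for absent terms
            idf = math.log((n_docs + 1) / (df[term] + 1)) + 1.0  # smoothed idf
            score += tf * idf
        scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

If the baseline and the T5 reranker produce sensible, clearly different scores on a small subset, that is a reasonable sign the dataset and its relevance judgments are usable for the task.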

pratham4521 commented 2 months ago

Respected sir, due to the large size of the standard datasets, we did not have enough time to train pre-implemented models on them. We have therefore decided to change the topic of our project.

parth126 commented 2 months ago

Include the reference paper in the proposal. The task is reranking using T5 on the TREC Robust and MS MARCO datasets.
