Title
Query Expansion by Prompting Large Language Models
Team Name
IRFighters
Email
202101222@daiict.ac.in
Team Member 1 Name
Priyesh Tandel
Team Member 1 Id
202101222
Team Member 2 Name
Keertivardhan Goyal
Team Member 2 Id
202103007
Team Member 3 Name
Yash Mashru
Team Member 3 Id
202103045
Team Member 4 Name
Sanchit Satija
Team Member 4 Id
202103054
Category
Reproducibility
Problem Statement
Users often issue short queries. Because traditional retrieval methods such as BM25 rely on exact term matching, they can fail to retrieve relevant documents that use different vocabulary, which lowers recall. To address this, we will use an LLM for query expansion with the different prompting methods described in the paper below. Specifically, we investigate the effectiveness of query expansion with Flan-T5-Small over BM25 alone; an implementation of BM25 is available through PyTerrier.
The paper also reports results for a much larger LLM, Flan-UL2 (20B parameters), but that model is unlikely to run on our hardware, so we will use Flan-T5-Small (60M parameters).
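As a minimal sketch of the expansion step, the helpers below build a zero-shot "Q2D"-style prompt and combine the LLM output with the original query. The exact prompt wording is our approximation of the paper's template, and the generation call to Flan-T5-Small is left out; the repetition of the original query (the paper uses 5 copies) keeps the original terms dominant in BM25 scoring.

```python
def build_q2d_prompt(query: str) -> str:
    # Zero-shot Query2Doc-style prompt; wording approximates the
    # paper's Q2D template and is not an exact copy.
    return f"Write a passage that answers the following query: {query}"


def expand_query(query: str, llm_output: str, repeats: int = 5) -> str:
    # Concatenate several copies of the original query with the LLM's
    # generated text, so expansion terms add recall without drowning
    # out the original query terms in BM25 term weighting.
    return " ".join([query] * repeats + [llm_output])
```

In the actual pipeline, `llm_output` would come from running Flan-T5-Small (e.g. via Hugging Face Transformers) on the prompt, and the expanded string would be fed to PyTerrier's BM25 retriever in place of the raw query.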
Evaluation Strategy
For evaluation we will use the metrics below (also used in the paper):
Recall
MRR@10
NDCG@10
We will evaluate and compare BM25 alone, BM25 with PRF (pseudo-relevance feedback), and BM25 with LLM-based query expansion. The paper reports several variations of these setups.
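To make the comparison concrete, here is a small sketch of the three evaluation metrics (binary or graded relevance, standard textbook definitions; in practice PyTerrier/ir_measures would compute these for us):

```python
import math


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    # Reciprocal rank of the first relevant document in the top k.
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, relevance, k=10):
    # `relevance` maps doc_id -> graded relevance label.
    dcg = sum(relevance.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranked_ids[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    # Fraction of all relevant documents retrieved in the top k.
    hits = sum(1 for d in ranked_ids[:k] if d in relevant_ids)
    return hits / len(relevant_ids) if relevant_ids else 0.0
```

These functions operate on a ranked list of document ids per query; averaging them over all queries gives the reported scores.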
Dataset
We will be using the following datasets:
1) MS MARCO
2) BEIR
Resources
Query Expansion by Prompting Large Language Models: https://arxiv.org/pdf/2305.03653