Code for reproducing the SIGIR '22 paper

Zero-shot Query Contextualization for Conversational Search

Our approach (ZeCo²) contextualizes the last user question within the conversation history, but restricts the matching to the last question and the potential answer only.
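The idea can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the toy 2-d vectors, and the use of cosine similarity in a ColBERT-style MaxSim score are assumptions made for illustration. In practice the token embeddings would come from encoding the whole conversation with ColBERT, so that the last question's tokens are already contextualized by the history.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def maxsim_score(query_embs, passage_embs):
    """Late-interaction score: for every query token, take its best match
    among the passage tokens, then sum over query tokens."""
    return sum(max(cosine(q, p) for p in passage_embs) for q in query_embs)


def zeco2_style_score(conversation_embs, last_question_start, passage_embs):
    """Score a passage using only the tokens of the last question.

    `conversation_embs` holds one embedding per token of the full
    conversation (history + last question); `last_question_start` is the
    (hypothetical) index where the last question begins.
    """
    last_question_embs = conversation_embs[last_question_start:]
    return maxsim_score(last_question_embs, passage_embs)


# Toy example: 4 conversation tokens, the last 2 belong to the last question.
conversation_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
passage_embs = [[0.0, 1.0], [1.0, 0.0]]

full = maxsim_score(conversation_embs, passage_embs)            # matches all tokens
restricted = zeco2_style_score(conversation_embs, 2, passage_embs)  # last question only
print(full, restricted)
```

In this sketch the history tokens influence the result only through the embeddings of the last question's tokens; they never contribute matching terms of their own.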

For more details, check our:

SIGIR'22 poster | paper

@inproceedings{krasakis-2022-zeroshot,
    author = {Krasakis, Antonios Minas and Yates, Andrew and Kanoulas, Evangelos},
    booktitle = {SIGIR 2022: 45th international ACM SIGIR Conference on Research and Development in Information Retrieval},
    month = {July},
    publisher = {ACM},
    title = {Zero-shot Query Contextualization for Conversational Search},
    year = {2022}}

Reproducing results:

Install ColBERT
Download the ColBERT model checkpoint and update the paths marked with #TODO in paths.py
Corpus indexing: create FAISS indexes using a ColBERT model (see [index your collection](README_ColBERT.md#ColBERT Indexing)). Note that preprocessing scripts are available in /preprocessing. To use our pipeline for retrieval and evaluation, you need to convert query and passage ids to integers and retain a mapping file (.intmapping) before indexing (see the preprocessing README and examples).
Retrieve & rerank using the available pipeline:

python_pipeline.py --setting ZeCo2 --dataset cast19

The ColBERT checkpoint used (trained for 400K steps) is available here
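The integer-id conversion required before indexing could look roughly like the sketch below. The scripts in /preprocessing are the reference; the function names and the tab-separated layout of the mapping file here are assumptions, and the example ids are made up.

```python
import os
import tempfile


def build_int_mapping(original_ids):
    """Assign consecutive integers to ids in order of first appearance."""
    mapping = {}
    for original_id in original_ids:
        if original_id not in mapping:
            mapping[original_id] = len(mapping)
    return mapping


def write_mapping(mapping, path):
    """Write one 'original_id<TAB>int_id' pair per line (assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for original_id, int_id in mapping.items():
            f.write(f"{original_id}\t{int_id}\n")


def read_mapping(path):
    """Load the mapping back, e.g. to restore original ids for evaluation."""
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            original_id, int_id = line.rstrip("\n").split("\t")
            mapping[original_id] = int(int_id)
    return mapping


# Illustrative passage ids; the repeated id keeps its first integer.
passage_ids = ["MARCO_12", "CAR_ab3f", "MARCO_12", "MARCO_907"]
mapping = build_int_mapping(passage_ids)

# Hypothetical file name, written to a temporary directory for the demo.
mapping_path = os.path.join(tempfile.mkdtemp(), "passages.intmapping")
write_mapping(mapping, mapping_path)
restored = read_mapping(mapping_path)
print(restored)
```

Keeping the mapping file next to the index makes it possible to translate the integer ids in the final rankings back to the original collection ids.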

Paper analysis section:

The two scripts used to reproduce the analysis section of the paper are:

token_embedding_change.py
embedding_closest_terms.py

You can already run the analysis, since the final rankings are provided under data/rankings/.

For ColBERT-related questions, instructions, etc. please refer to the original repository (forked from v0.2.0) or README_ColBERT.md, or feel free to raise an issue!
