jwkanggist / SSL-narratives-NLP-1

Reading self-supervised learning in NLP in reverse

[6주차] Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval #7


isingmodel commented 1 year ago

Keywords

Contrastive learning, ANCE

TL;DR

Dense retrieval underperforms word-based sparse retrieval because in-batch negatives are mostly uninformative. ANCE instead mines hard negatives globally from the entire corpus with an asynchronously refreshed ANN index, nearly matching a BERT-based cascade IR pipeline at roughly 100x lower cost.
Abstract

Conducting text retrieval in a dense representation space has many intriguing advantages. Yet the end-to-end learned dense retrieval (DR) often underperforms word-based sparse retrieval. In this paper, we first theoretically show the learning bottleneck of dense retrieval is due to the domination of uninformative negatives sampled locally in batch, which yield diminishing gradient norms, large stochastic gradient variances, and slow learning convergence. We then propose Approximate nearest neighbor Negative Contrastive Learning (ANCE), a learning mechanism that selects hard training negatives globally from the entire corpus, using an asynchronously updated ANN index. Our experiments demonstrate the effectiveness of ANCE on web search, question answering, and in a commercial search environment, showing ANCE dot-product retrieval nearly matches the accuracy of BERT-based cascade IR pipeline, while being 100x more efficient.
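
To make the bottleneck argument concrete, here is a hedged sketch using the standard contrastive NLL that dense retrievers optimize (notation is mine, not verbatim from the paper):

$$
\mathcal{L}(q, d^+, d^-) = -\log \frac{\exp\!\big(f(q) \cdot g(d^+)\big)}{\exp\!\big(f(q) \cdot g(d^+)\big) + \exp\!\big(f(q) \cdot g(d^-)\big)}
$$

The gradient with respect to the scores scales with $p(d^- \mid q)$, the softmax probability the model assigns to the negative. A negative sampled locally in the batch is usually trivially irrelevant, so $p(d^- \mid q) \approx 0$, the per-example gradient norm diminishes, and updates are dominated by stochastic gradient variance. Globally mined hard negatives keep this term large, which is exactly what ANCE targets.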
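Below is a minimal, non-authoritative Python sketch of the ANCE-style mining loop, not the paper's implementation: a faiss flat inner-product index stands in for the asynchronously refreshed ANN index, and `encode` is a hypothetical placeholder for the BERT dual encoder (random unit vectors here, so the sketch runs end to end).

```python
# Sketch of ANCE-style global hard-negative mining.
# Requires: pip install faiss-cpu numpy torch
import numpy as np
import faiss
import torch
import torch.nn.functional as F

DIM = 128  # embedding dimension (illustrative)

def encode(texts):
    """Hypothetical encoder stand-in; a real setup would run a BERT-style
    dual encoder. Random unit vectors let the sketch execute end to end."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal((len(texts), DIM)).astype("float32")
    return v / np.linalg.norm(v, axis=1, keepdims=True)

corpus = [f"doc {i}" for i in range(10_000)]

# 1) The "asynchronously updated" index: periodically re-encode the corpus
#    with a (slightly stale) checkpoint and rebuild the ANN index.
doc_vecs = encode(corpus)
index = faiss.IndexFlatIP(DIM)  # exact inner-product search as a stand-in for ANN
index.add(doc_vecs)

def mine_hard_negatives(query_vecs, positives, k=200, n_neg=1):
    """Retrieve top-k docs per query from the *entire* corpus and take
    negatives from them, excluding the annotated positive."""
    _, ids = index.search(query_vecs, k)
    negs = []
    for row, pos in zip(ids, positives):
        cand = [i for i in row if i != pos]
        negs.append(cand[:n_neg])
    return np.array(negs)

# 2) Contrastive (NLL) step on (query, positive, mined hard negative).
q = encode(["what is dense retrieval?"])
pos_id = 42
neg_ids = mine_hard_negatives(q, [pos_id])

q_t = torch.from_numpy(q)
d_pos = torch.from_numpy(doc_vecs[[pos_id]])
d_neg = torch.from_numpy(doc_vecs[neg_ids[0]])
scores = torch.cat([q_t @ d_pos.T, q_t @ d_neg.T], dim=1)  # dot-product scores
loss = F.cross_entropy(scores, torch.zeros(1, dtype=torch.long))
print(f"NLL on (q, d+, d-): {loss.item():.4f}")
```

In the paper's setup the index refresh runs asynchronously alongside training, so negatives are mined with a checkpoint a few steps behind the current model; the sketch above collapses that into a single rebuild for brevity.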

Paper link

https://bit.ly/3MhKigv

Presentation link

https://docs.google.com/presentation/d/1-7xLTaJqKNgGNLqNVUsgXgWSxPN_mCpG/edit?usp=sharing&ouid=101655033362467115643&rtpof=true&sd=true

Video link

https://youtu.be/ZF9Q-jdb_rE