caskcsg / ir

ConTextual Mask Auto-Encoder for Dense Passage Retrieval

Reproducing CoT-MAE results on NQ #4

Open zhengmq2010 opened 1 year ago

zhengmq2010 commented 1 year ago

Hi, I want to reproduce the results on NQ, but I don't see the hyperparameter settings in the paper or the repo. Do I just need to follow the script msmarco/eval_msmarco.sh and modify the arguments to reproduce them? Could you provide more instructions? It would also be great if you could provide the model weights of the retrievers for both fine-tuning stages on NQ. Thanks in advance!

ma787639046 commented 1 year ago

Sorry for the late reply. Issues raised in our organization repo do not send me notifications.

Following Condenser, we use DPR to train and test on NQ.

A two-stage pipeline is used:

Stage 1: Train with BM25 negatives.

Stage 2: Train with BM25 negatives + hard negatives mined by the CoT-MAE stage-1 retriever.

You can also refer to the coCondenser-nq README (https://github.com/texttron/tevatron/blob/main/examples/coCondenser-nq/README.md) for pipeline instructions; the difference is that we use the hard negatives mined by the CoT-MAE stage-1 retriever rather than the negatives provided by coCondenser. A rough sketch of the pipeline is below.
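For reference, here is a rough sketch of how the two stages could be run with tevatron, in the style of the linked coCondenser-nq README. The model path, output directories, and hyperparameter values are placeholders and not necessarily the exact settings we used, so please treat this as an outline rather than an official script:

```bash
# Stage 1: fine-tune the CoT-MAE pre-trained encoder on NQ with BM25 negatives.
# (paths and hyperparameter values are illustrative placeholders)
python -m tevatron.driver.train \
  --output_dir retriever_nq_stage1 \
  --model_name_or_path /path/to/cotmae_pretrained_encoder \
  --dataset_name Tevatron/wikipedia-nq \
  --fp16 \
  --per_device_train_batch_size 32 \
  --train_n_passages 2 \
  --learning_rate 1e-5 \
  --q_max_len 32 \
  --p_max_len 156 \
  --num_train_epochs 40

# Between stages: encode the corpus and training queries with the stage-1
# retriever, retrieve top passages for each training query, and build a new
# training set whose hard negatives come from these stage-1 results
# (this replaces the coCondenser-provided hard negatives in the linked README).

# Stage 2: fine-tune again from the pre-trained encoder, this time on
# BM25 negatives + the stage-1 mined hard negatives (e.g. by pointing the
# trainer at the locally built training files).
python -m tevatron.driver.train \
  --output_dir retriever_nq_stage2 \
  --model_name_or_path /path/to/cotmae_pretrained_encoder \
  --train_dir /path/to/nq_train_with_stage1_hard_negatives \
  --fp16 \
  --per_device_train_batch_size 32 \
  --train_n_passages 2 \
  --learning_rate 1e-5 \
  --q_max_len 32 \
  --p_max_len 156 \
  --num_train_epochs 40
```

The second stage is deliberately the same recipe as the first; only the source of the hard negatives changes.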