caskcsg / ir

ConTextual Mask Auto-Encoder for Dense Passage Retrieval
Apache License 2.0

Details about reranker training #3

Closed jordane95 closed 1 year ago

jordane95 commented 1 year ago

Hi @ma787639046, thanks for open-sourcing this interesting work!

I have some questions about the reranker training details.

In your paper, the reported reranker MRR@10 is 43.3, while in the HF model card, the reported MRR@10 is 43.9.

Did you use a different base model for initialization, or different hyperparameters? Would you mind sharing more training details? I couldn't find them in your paper or in the repo...

Thanks!

ma787639046 commented 1 year ago
  1. About the version of the re-ranker: Hi, the released re-ranker was used for the MS MARCO Passage Ranking Leaderboard. We incorporated R-Drop into the re-ranker training phase to compete for a higher score on the leaderboard, which raised MRR@10 to 43.9 (a rough sketch of R-Drop follows below). Apart from the addition of R-Drop, it shares the same settings as the re-ranker used in the paper.
  2. Training details of the re-ranker: The re-ranker is a simple BERT + classification head, e.g. BertForSequenceClassification. I use a listwise cross-entropy loss on top of BertForSequenceClassification as the loss function (a sketch of this loss follows below). The re-ranker is initialized from cotmae-base-uncased. Negatives are mined with the CoT-MAE stage-2 retriever (cotmae_base_msmarco_retriever), and the candidate rankings that the re-ranker re-scores also come from the stage-2 retriever. I train with lr=4e-5, epochs=2, num_of_passage=64. This re-ranker achieves MRR@10 43.3.
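
Since the reply pins down the objective (one relevance logit per query-passage pair, listwise cross-entropy over num_of_passage=64 candidates), here is a minimal sketch of that loss; this is not the authors' code. The checkpoint name caskcsg/cotmae_base_uncased and the convention that the positive passage sits at index 0 of each group are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, BertForSequenceClassification

GROUP_SIZE = 64  # num_of_passage from the reply above

# Assumption: the CoT-MAE pre-trained encoder published under this name.
MODEL_NAME = "caskcsg/cotmae_base_uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)

def listwise_ce_loss(logits: torch.Tensor) -> torch.Tensor:
    """Treat each group of GROUP_SIZE relevance logits as one softmax
    distribution; the positive passage is assumed to be at index 0."""
    scores = logits.view(-1, GROUP_SIZE)  # (num_queries, GROUP_SIZE)
    target = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    return F.cross_entropy(scores, target)

# Toy batch: one query paired with 1 positive + 63 mined negatives.
query = "what is dense passage retrieval"
passages = ["a relevant passage"] + ["an irrelevant passage"] * (GROUP_SIZE - 1)
batch = tokenizer([query] * GROUP_SIZE, passages,
                  truncation=True, padding=True, return_tensors="pt")
logits = model(**batch).logits  # (GROUP_SIZE, 1)
loss = listwise_ce_loss(logits)
loss.backward()
```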
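
And a rough sketch of the R-Drop addition mentioned in point 1, again an assumption rather than the authors' exact recipe: the same batch is forwarded twice so that dropout yields two different score distributions, and a symmetric KL penalty (weighted by a hypothetical alpha) is added to the listwise loss. This reuses GROUP_SIZE, model, and batch from the sketch above.

```python
import torch
import torch.nn.functional as F

def r_drop_listwise_loss(logits1, logits2, alpha=1.0):
    """Listwise CE on both dropout passes plus a symmetric KL penalty
    between the two per-group score distributions."""
    s1 = logits1.view(-1, GROUP_SIZE)
    s2 = logits2.view(-1, GROUP_SIZE)
    target = torch.zeros(s1.size(0), dtype=torch.long, device=s1.device)
    ce = F.cross_entropy(s1, target) + F.cross_entropy(s2, target)
    lp1, lp2 = F.log_softmax(s1, dim=-1), F.log_softmax(s2, dim=-1)
    kl = (F.kl_div(lp1, lp2, log_target=True, reduction="batchmean")
          + F.kl_div(lp2, lp1, log_target=True, reduction="batchmean"))
    return ce + alpha * kl / 2

# Two stochastic forward passes over the same batch (dropout active in train mode):
logits_a = model(**batch).logits
logits_b = model(**batch).logits
loss = r_drop_listwise_loss(logits_a, logits_b)
```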
jordane95 commented 1 year ago

Thanks for the clarification! That's very helpful!