Hi @ma787639046, thanks for open-sourcing this interesting work!
I have some questions about the reranker training details.
In your paper, the reported reranker MRR@10 is 43.3, while in the HF model card the reported MRR@10 is 43.9.
Did you use a different base model for initialization, or different hyperparameters? Would you mind sharing more training details? I couldn't find them in your paper or in the repo...
About the version of the re-ranker
Hi, the released re-ranker is the one used for the MS MARCO Passage Ranking Leaderboard. We incorporated R-Drop into the re-ranker training phase to compete for a higher score on the leaderboard, which is how it reaches the higher MRR@10 of 43.9. Apart from the incorporation of R-Drop, it shares the same settings as the re-ranker used in the paper.
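For reference, R-Drop runs each batch through the model twice with dropout enabled and adds a symmetric KL penalty between the two output distributions. A minimal sketch of how it combines with the listwise loss (a sketch under assumptions, not the exact training code; `alpha` is a hypothetical KL weight, its value isn't specified here):

```python
import torch.nn.functional as F

def rdrop_listwise_loss(scores1, scores2, target, alpha=1.0):
    """scores1/scores2: (batch, num_of_passage) logits from two forward
    passes of the same batch with dropout active; target: (batch,) index
    of the positive passage within each group."""
    # Average listwise cross-entropy over the two dropout-perturbed passes
    ce = 0.5 * (F.cross_entropy(scores1, target) + F.cross_entropy(scores2, target))
    # Symmetric KL between the two score distributions (the R-Drop term)
    logp, logq = F.log_softmax(scores1, -1), F.log_softmax(scores2, -1)
    kl = 0.5 * (F.kl_div(logp, logq, reduction="batchmean", log_target=True)
                + F.kl_div(logq, logp, reduction="batchmean", log_target=True))
    return ce + alpha * kl
```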
Training details of the re-ranker
The re-ranker is a simple BERT encoder plus a classification head, i.e. BertForSequenceClassification. I use a listwise cross-entropy loss on top of BertForSequenceClassification as the loss function, roughly as sketched below.
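Concretely, each query is grouped with its candidate passages (the positive first), and the per-pair logits are treated as a multi-class classification over the group. A minimal sketch, not the exact training code (the tokenizer/model ids are stand-ins; the actual run initializes from cotmae-base-uncased):

```python
import torch
import torch.nn.functional as F
from transformers import BertForSequenceClassification, BertTokenizerFast

NUM_PASSAGES = 64  # num_of_passage in the setting described below

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # stand-in id; the real run starts from cotmae-base-uncased
    num_labels=1,         # one relevance logit per (query, passage) pair
)

def listwise_ce_loss(queries, passage_groups):
    """queries: list of B strings; passage_groups: B lists of NUM_PASSAGES
    passages each, with the positive passage first in every group."""
    q_texts = [q for q, group in zip(queries, passage_groups) for _ in group]
    p_texts = [p for group in passage_groups for p in group]
    enc = tokenizer(q_texts, p_texts, truncation=True, padding=True,
                    return_tensors="pt")
    # (B * NUM_PASSAGES, 1) logits -> (B, NUM_PASSAGES) score groups
    scores = model(**enc).logits.view(len(queries), NUM_PASSAGES)
    # Listwise cross-entropy: the positive sits at index 0 of each group
    target = torch.zeros(len(queries), dtype=torch.long)
    return F.cross_entropy(scores, target)
```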
The re-ranker is initialized from cotmae-base-uncased. Negatives are mined with the CoT-MAE stage-2 retriever (cotmae_base_msmarco_retriever), and the retrieval ranks that the re-ranker re-scores are also produced by the stage-2 retriever. I train with lr=4e-5, epoch=2, num_of_passage=64. This re-ranker achieves MRR@10 43.3.
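At inference time the released re-ranker can be used like any sequence-classification model to re-score retriever candidates. A usage sketch (the checkpoint id below is illustrative; please check the HF model card for the released re-ranker's actual id):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative checkpoint id -- see the HF model card for the released re-ranker
name = "cotmae_base_msmarco_reranker"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

query = "what is the capital of france"
candidates = [  # e.g. top-k passages returned by the stage-2 retriever
    "Paris is the capital and most populous city of France.",
    "Berlin is the capital of Germany.",
]
enc = tokenizer([query] * len(candidates), candidates,
                truncation=True, padding=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**enc).logits.squeeze(-1)  # one relevance score per pair
reranked = [candidates[i] for i in scores.argsort(descending=True)]
```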
Thanks!