staoxiao / RetroMAE

Codebase for RetroMAE and beyond.
Apache License 2.0

How to finetune with ANCE? #10

Victoriaheiheihei opened this issue 1 year ago

Victoriaheiheihei commented 1 year ago

Hello, is the code published in the repo for fine-tuning the model with DPR? Where can I find the code for fine-tuning with ANCE? And there's another question that confuses me: is the code in the repo used to distill the retriever with a teacher model, with the result corresponding to (0.416/0.709/0.927/0.988)?

staoxiao commented 1 year ago

Thanks for your interest in RetroMAE!

We fine-tune the model with hard negatives by changing the argument neg_file (note that we do not update the hard negatives dynamically during training, as the original ANCE does). You can use our provided hard_negs.txt to reproduce the results, or generate new hard negatives following our commands.
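For concreteness, a minimal sketch of what such a run could look like. Only the neg_file argument, the hard_negs.txt file, the Shitao/RetroMAE_MSMARCO checkpoint, and the script's batch size 16 / 4 epochs are confirmed in this thread; the entry point `bi_encoder.run` and the remaining flags are assumed Hugging Face Trainer-style conventions and may differ from the actual script:

```bash
# Hypothetical sketch of fine-tuning with static hard negatives.
# The module path and most flags are assumptions; check the repo's
# bi_encoder example for the authoritative command.
torchrun --nproc_per_node 8 -m bi_encoder.run \
    --output_dir ./output/retromae_msmarco_ance \
    --model_name_or_path Shitao/RetroMAE_MSMARCO \
    --do_train \
    --neg_file ./data/hard_negs.txt \
    --per_device_train_batch_size 16 \
    --num_train_epochs 4
```

Swapping in newly mined negatives is then just a matter of pointing --neg_file at the new file.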

The cross-encoder example fine-tunes a teacher model whose prediction scores are used for distillation. To distill the retriever, you need to generate the teacher_score_files with the cross-encoder and add that argument to the training command in the bi_encoder example.
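A hedged outline of that two-step flow, under the same caveats (only the teacher_score_files argument and the cross-encoder/bi_encoder example split are confirmed above; entry points, paths, and the other flags are illustrative):

```bash
# Step 1 (hypothetical entry point): fine-tune the cross-encoder teacher,
# which produces prediction scores for the training pairs.
torchrun --nproc_per_node 8 -m cross_encoder.run \
    --output_dir ./output/cross_encoder_teacher \
    --do_train

# Step 2: distill the retriever by adding the teacher scores to the
# bi-encoder command; the score file name below is a placeholder.
torchrun --nproc_per_node 8 -m bi_encoder.run \
    --output_dir ./output/retromae_msmarco_distill \
    --model_name_or_path Shitao/RetroMAE_MSMARCO \
    --do_train \
    --neg_file ./data/hard_negs.txt \
    --teacher_score_files ./output/cross_encoder_teacher/teacher_scores.txt
```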

Victoriaheiheihei commented 1 year ago

Thank you for your reply! I tried to fine-tune the ANCE model (using hard_negs.txt on Shitao/RetroMAE_MSMARCO), but the results are lower than the reported ones. Do you fine-tune the ANCE model on Shitao/RetroMAE_MSMARCO or on the model already fine-tuned with DPR? I trained the model with batch_size 32 and 8 epochs as mentioned in the paper (but the batch_size and epochs in the script are 16 and 4; I'm not sure which group of parameters was used for fine-tuning).

staoxiao commented 1 year ago

For ANCE, we fine-tune the Shitao/RetroMAE_MSMARCO model. Please use the hyper-parameters in our script, which we found work better.

Victoriaheiheihei commented 1 year ago

Thank you. I will try again.