jingtaozhan / DRhard

SIGIR'21: Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track.
BSD 3-Clause "New" or "Revised" License

Training setup of ANCE and STAR #21

Open ranonrkm opened 2 years ago

ranonrkm commented 2 years ago

Hi, thank you for publishing the code for your interesting paper. I am trying to reproduce the STAR results in the ANCE setup, i.e. I am using static hard negatives and in-batch negatives, but I am unable to reach an MRR@10 of 0.34. Also, the STAR checkpoint provided in this repo does not give an MRR@10 of 0.34 when evaluated using the ANCE repo. I am getting MRR@10 of 0.299 instead. I see there are some differences between the training setup in your repo and the ANCE one. Could you please highlight them?
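For reference, when comparing numbers across two evaluation pipelines it can help to pin down exactly how MRR@10 is computed. The following is only a minimal sketch, not the evaluation code of either repo: the function name `mrr_at_k` and the dictionary layouts are hypothetical, and it assumes the score is averaged over every query that has relevance judgments, with unanswered queries counting as 0.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k.

    rankings: query id -> passage ids, best first (hypothetical layout).
    qrels:    query id -> set of relevant passage ids (hypothetical layout).
    Assumption: the mean is taken over all queries in `qrels`.
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        # Only the top-k retrieved passages can contribute to the score.
        for rank, pid in enumerate(rankings.get(qid, [])[:k], start=1):
            if pid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(qrels)
```

A different choice of denominator (for example, only the queries present in the run file) or of cutoff would shift the reported number, so both pipelines should be checked for the same convention.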

jingtaozhan commented 2 years ago

> I am getting MRR@10 of 0.299 instead.

This is indeed unexpected. Please check whether the model you loaded is the correct one: run md5sum pytorch_model.bin and you should get aee57f170b7a3334a225c35cfba0a122. Also, could you follow the inference instructions in our README and see whether you get 0.34?
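If md5sum is not available on your machine, the same check can be done from Python. This is a minimal sketch using only the standard library; the helper name `file_md5` is hypothetical, and the expected digest is the one quoted above.

```python
import hashlib

# Digest quoted in this thread for the released STAR checkpoint.
EXPECTED_MD5 = "aee57f170b7a3334a225c35cfba0a122"


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_md5("pytorch_model.bin") == EXPECTED_MD5` should return `True` for an intact download; the path is an assumption and should point to wherever the checkpoint was saved.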