facebookresearch / contriever

Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning

Script finetuning on MSMarco #9

Open gangiswag opened 2 years ago

gangiswag commented 2 years ago

Thanks a lot for releasing the code and the scripts for pre-training.

I'm trying to reproduce the MS-Marco numbers after fine-tuning, and it would be great if you could also release the fine-tuning scripts.

Specifically, I had questions about training the model after mining the hard negatives.

Is that model initialized from the pre-trained contriever model or from the contriever model fine-tuned with random negatives?

gizacard commented 2 years ago

Hi! I've uploaded the script I used for fine-tuning here: https://github.com/facebookresearch/contriever/blob/main/finetuning.py. It does not support the ASAM optimizer that I used to fine-tune the English model. I'll try to add an example script when I have more time. For hard negative mining, I first train contriever on supervised data, then mine hard negatives, then retrain the model with these hard negatives. I did the mining in a fairly manual way, which makes it hard to write a single script that pipelines all the steps.
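For readers following the thread, here is a minimal sketch of what the mining step (the second of the three stages described above) could look like. It is not the procedure used by the author, who did this manually. The function name `mine_hard_negatives`, its parameters, and the toy embeddings are all hypothetical, and dot-product scoring over precomputed embeddings is an assumption.

```python
import numpy as np

def mine_hard_negatives(query_emb, passage_emb, positives, num_negatives=2, top_k=5):
    """Hypothetical helper: for each query, return the ids of the highest-scoring
    passages that are not labeled as positives for that query."""
    # Dot-product similarity, shape (num_queries, num_passages). Assumed scoring function.
    scores = query_emb @ passage_emb.T
    # Passage ids sorted by descending score, truncated to the top_k candidates.
    ranked = np.argsort(-scores, axis=1)[:, :top_k]
    hard_negatives = []
    for qid, candidates in enumerate(ranked):
        # Drop known positives; what remains are high-scoring non-relevant passages.
        negs = [int(pid) for pid in candidates if int(pid) not in positives[qid]]
        hard_negatives.append(negs[:num_negatives])
    return hard_negatives

# Toy data standing in for embeddings produced by the first-stage fine-tuned model.
query_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
passage_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
positives = [{0}, {2}]  # gold passage ids per query

print(mine_hard_negatives(query_emb, passage_emb, positives, num_negatives=1, top_k=3))
# prints [[1], [3]]
```

The mined ids would then be written into the training data as negatives for the retraining stage. How many negatives to keep per query and how deep to search (`top_k`) are choices this thread does not specify.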