texttron / tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.
http://tevatron.ai
Apache License 2.0

Contrastive pre-training with InfoNCE loss #120

Closed yurinoviello closed 3 months ago

yurinoviello commented 4 months ago

I am trying to reproduce (with some differences) the results obtained for the E5 model family.

The second-stage fine-tuning is perfectly reproducible with this repo (I am using v2).

However, for the contrastive pre-training, I wanted to change the cross-entropy loss of the EncoderModel to the InfoNCE loss. This should be enough, right?
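
For context, this is roughly what I have in mind, a minimal sketch of InfoNCE over in-batch negatives (the cosine normalization and the temperature value are my assumptions, not something taken from this repo):

```python
import torch
import torch.nn.functional as F

def infonce_loss(q_reps: torch.Tensor, p_reps: torch.Tensor, temperature: float = 0.01) -> torch.Tensor:
    """InfoNCE with in-batch negatives.

    q_reps: (batch, dim) query embeddings
    p_reps: (batch, dim) passage embeddings, one positive per query, aligned by index
    """
    q_reps = F.normalize(q_reps, dim=-1)
    p_reps = F.normalize(p_reps, dim=-1)
    # similarity of every query against every passage in the batch
    scores = q_reps @ p_reps.T / temperature          # (batch, batch)
    # the positive for query i is passage i; all other columns act as negatives
    target = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, target)
```

As far as I understand, this is essentially cross-entropy over the query-passage score matrix, just with cosine similarity and a temperature, which is why I assumed swapping the loss would be enough.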

Also, it is not clear to me how to perform training with in-batch negatives only using this repo. I did not find any option on the trainer, and when I have examples without negative_passages I get an error.

Thank you.

ArvinZhuang commented 4 months ago

Setting --train_group_size to 1 should make the training use in-batch negatives only.
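
With a group size of 1, each training example contributes only its positive passage, so the only negatives a query sees are the other queries' positives in the batch. A simplified sketch of the idea (not the exact Tevatron code) is:

```python
import torch
import torch.nn.functional as F

n_queries, group_size = 4, 1
n_passages = n_queries * group_size   # with group_size=1: one positive per query

# scores: (n_queries, n_passages) similarity matrix over the whole batch
scores = torch.randn(n_queries, n_passages)

# each query's positive sits at column i * group_size;
# with group_size=1 that is the diagonal, i.e. pure in-batch negatives
target = torch.arange(n_queries) * group_size
loss = F.cross_entropy(scores, target)
```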