UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Models w/Contrastive Learning Objective #1202

Open PrithivirajDamodaran opened 2 years ago

PrithivirajDamodaran commented 2 years ago

Thank you for your awesome work in the sentence embedding space!

I would like your help with a couple of questions:

Thanks in Advance

nreimers commented 2 years ago

None of the uploaded models have been trained with contrastive learning, as it performs rather poorly compared to MultipleNegativesRankingLoss.

See https://www.sbert.net/docs/package_reference/losses.html#contrastiveloss: you have pairs and a label that indicates whether the pair is positive (and should be close in vector space) or negative (and should be far apart in vector space).
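
For reference, a minimal ContrastiveLoss training sketch could look like the following; the base model name and the toy pairs are only placeholders, not the setup used for any released model:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Placeholder base model; any transformer checkpoint works as a starting point.
model = SentenceTransformer("distilbert-base-uncased")

# Pairs with a binary label: 1 = positive (pull together), 0 = negative (push apart).
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a meal."], label=1),
    InputExample(texts=["A man is eating food.", "A plane is taking off."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

train_loss = losses.ContrastiveLoss(model=model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```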

PrithivirajDamodaran commented 2 years ago

Thank you! Got it, but for all practical purposes, training with anchor, positive, and negative pairs still falls under the rubric of "contrastive learning", since we show pairs for the model to learn a metric (say, a similarity metric), even though the learning objective differs slightly from the vanilla contrastive loss.

...and so all models are trained with MultipleNegativesRankingLoss?

nreimers commented 2 years ago

The most recent msmarco models have been trained with MarginMSE loss; the other (most recent) models with MultipleNegativesRankingLoss.
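
For comparison, a minimal MultipleNegativesRankingLoss sketch (again with a placeholder base model and toy data): each example is an (anchor, positive) pair, and the positives of the other examples in the batch act as in-batch negatives.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Placeholder base model, not the checkpoint used for the released models.
model = SentenceTransformer("distilbert-base-uncased")

# (anchor, positive) pairs; an optional hard negative could be appended as a third text.
train_examples = [
    InputExample(texts=["what is the capital of france", "Paris is the capital of France."]),
    InputExample(texts=["how many legs do spiders have", "Spiders have eight legs."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

train_loss = losses.MultipleNegativesRankingLoss(model=model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```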

PrithivirajDamodaran commented 2 years ago

Thanks a lot

One last question... is there a place in the repo or on the SBERT site where I can see which models were trained with which loss functions?

nreimers commented 2 years ago

The newer models have a Train_script.py uploaded in their model hub git repository, which is the code that was used to train the respective model.
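
If it helps, one way to fetch such a script programmatically is via huggingface_hub; the repo id and filename below are only illustrative assumptions, so check the "Files" tab of the specific model page for the actual script name:

```python
from huggingface_hub import hf_hub_download

# Illustrative repo id and filename; the exact names vary per model repo.
path = hf_hub_download(
    repo_id="sentence-transformers/msmarco-distilbert-base-tas-b",
    filename="train_script.py",
)
print(path)  # local path to the downloaded training script
```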