princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License
3.31k stars 502 forks

[question] Pretrained sentence embedding model fine-tuning #265

Closed nurulchamidah closed 7 months ago

nurulchamidah commented 7 months ago

Hello, thanks for your great and inspiring work. I would like to ask: is it possible to fine-tune, in a supervised manner, sentence embeddings that were produced by unsupervised learning, or the other way around? Alternatively, can a pretrained sentence-embedding model trained on supervised triplets (with hard negatives) be fine-tuned with non-triplet data (e.g., entailment pairs) from another dataset? Is this possible? Thanks in advance.

yaoxingcheng commented 7 months ago

Hi, thanks for your attention. That's an interesting idea, and I believe it's possible. For reference, a recent work, BGE, is first tuned on large-scale, automatically extracted text pairs and then fine-tuned on high-quality datasets such as NLI.
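One reason the staged recipe works is that triplet (with hard negatives) and plain entailment-pair training can share the same in-batch contrastive (InfoNCE) objective, as in SimCSE: the hard negatives just add extra columns to the similarity matrix. Below is a minimal NumPy sketch of that loss (illustrative only, not the repo's training code; `temperature=0.05` follows the SimCSE paper's default):

```python
import numpy as np

def info_nce_loss(anchors, positives, hard_negatives=None, temperature=0.05):
    """In-batch contrastive (InfoNCE) loss over sentence embeddings.

    anchors, positives: (N, d) embedding matrices; row i of each is a
    positive pair (e.g., premise/entailment).
    hard_negatives: optional (N, d) matrix; when given, its rows are
    appended as extra negative candidates, mirroring supervised SimCSE
    training on (anchor, positive, hard-negative) triplets.
    """
    def l2_normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p = l2_normalize(anchors), l2_normalize(positives)
    sim = a @ p.T  # (N, N) cosine similarities; diagonal = true pairs
    if hard_negatives is not None:
        # extra negative columns: (N, 2N) similarity matrix
        sim = np.concatenate([sim, a @ l2_normalize(hard_negatives).T], axis=1)
    sim = sim / temperature
    # cross-entropy with the diagonal index as the target class
    logits = sim - sim.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    return float(-log_probs[idx, idx].mean())
```

Because the pair-only case is just the triplet case with the hard-negative columns dropped, a model pretrained on triplets can be fine-tuned on entailment pairs (or vice versa) without changing the objective, only the batch construction.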

nurulchamidah commented 7 months ago

Okay, thank you for your answer and the reference. It helps me a lot.