UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

What does "unsupervised STS" mean? #428

Open empty-id opened 3 years ago

empty-id commented 3 years ago

How could SBERT (before fine-tuning) work without supervision? I don't quite understand section 4 of the paper.

I mean, is it fair to compare BERT without fine-tuning and SBERT after fine-tuning with labeled data?

nreimers commented 3 years ago

"Unsupervised STS" means without any explicit training data for STS. But other training data can be used.

"Supervised STS" means with training data for STS.

If you have training data for STS, the results on STS are of course better than with no explicit STS training data.
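To illustrate the "unsupervised STS" protocol described above, here is a minimal plain-Python sketch. The embeddings and gold scores are made-up toy values, and the `cosine`/`spearman` helpers stand in for the usual evaluation utilities: the point is that the human similarity scores are used only to *score* the model's predictions, never to train it.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def spearman(xs, ys):
    # Spearman rank correlation (no tie handling, for brevity).
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy embeddings from a hypothetical model that was NEVER trained
# on STS labels (e.g. SBERT fine-tuned on NLI data only).
embeddings = {
    "A man is playing a guitar": [0.9, 0.1, 0.2],
    "A person plays an instrument": [0.8, 0.2, 0.3],
    "A dog runs through a field": [0.1, 0.9, 0.1],
}

pairs = [
    ("A man is playing a guitar", "A person plays an instrument"),
    ("A man is playing a guitar", "A dog runs through a field"),
    ("A person plays an instrument", "A dog runs through a field"),
]
gold = [4.5, 0.5, 0.8]  # human scores, used ONLY for evaluation

predicted = [cosine(embeddings[s1], embeddings[s2]) for s1, s2 in pairs]
print(spearman(predicted, gold))
```

In the supervised setting, the same gold scores would additionally appear in a training split used to fine-tune the model, which is what makes the two numbers hard to compare directly.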

empty-id commented 3 years ago

Yeah, sorry, I see where I went wrong. The triplet loss can actually fit unsupervised training, or more precisely, self-supervised training. Is that right?

However, how can we guarantee that the "non-explicit training data" is independent of the evaluation data in "unsupervised STS"? In my opinion, "without explicit training data" may refer to semi-supervised learning (or transfer learning?) rather than unsupervised learning.
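The triplet loss mentioned above can be sketched in plain Python to show why it needs no similarity labels: the only "supervision" is which sentence counts as positive vs. negative for an anchor, and such triples can often be derived automatically from the data. The vectors and the margin value below are toy assumptions, not values from the paper.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Push the positive closer to the anchor than the negative,
    # by at least `margin`: max(0, d(a, p) - d(a, n) + margin).
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)

# Toy embeddings: no human similarity score appears anywhere.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]   # e.g. a paraphrase of the anchor sentence
negative = [0.0, 1.0]   # e.g. an unrelated sentence

loss = triplet_loss(anchor, positive, negative)
# loss is 0.0 here: the positive is already closer by more than the margin
```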