UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

The right way of using sentence embeddings for classification #1569

Open zsun9 opened 2 years ago

zsun9 commented 2 years ago

Hello, as a beginner in NLP I am trying to figure out how to use SBERT sentence embeddings for a sentence classification task. From the description here, it seems that the sentence embedding results from pooling the contextualized word vectors produced by the transformer. However, according to the original BERT paper, aren't we supposed to use the hidden vector of the [CLS] token as the sentence representation when performing a classification task? The same question applies to other tasks, such as topic modelling: should one use the average of the word embeddings or the hidden vector of the [CLS] token?

I am not sure whether I am missing or misunderstanding something, and I would be grateful if anyone could help me with this. Thanks!
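For reference, the two pooling strategies being compared can be sketched with plain NumPy on a toy batch of contextualized token vectors (the shapes and values here are illustrative, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend output of a transformer encoder for one sentence:
# (seq_len, hidden_dim), where position 0 is the [CLS] token.
token_embeddings = rng.standard_normal((6, 8))

# Strategy 1 (original BERT paper): take the hidden vector
# of the [CLS] token as the sentence representation.
cls_embedding = token_embeddings[0]

# Strategy 2 (SBERT default): mean pooling over all token vectors.
mean_embedding = token_embeddings.mean(axis=0)

# Both yield a fixed-size vector of the hidden dimension.
print(cls_embedding.shape, mean_embedding.shape)  # (8,) (8,)
```

Either vector can then be fed to a downstream classifier; the difference is only in how the token vectors are reduced to one sentence vector.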

nreimers commented 2 years ago

For classification, both can be used: the [CLS] vector or mean pooling.

Have a look at: https://towardsdatascience.com/sentence-transformer-fine-tuning-setfit-outperforms-gpt-3-on-few-shot-text-classification-while-d9a3788f0b4e?gi=96ec1f1da301
