UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

The right way of using sentence embeddings for classification #1569

Open zsun9 opened 2 years ago

zsun9 commented 2 years ago

Hello, as a beginner in NLP I am trying to figure out how to use SBERT sentence embeddings for a sentence classification task. From the description here, it seems that the sentence embedding results from pooling the contextualized word vectors produced by the transformer. However, according to the original BERT paper, aren't we supposed to use the hidden vector of the [CLS] token as the sentence representation when performing a classification task? The same question applies to other tasks, such as topic modelling: should one use the average of the word embeddings or the hidden vector of the [CLS] token?

I am not sure whether I am missing or misunderstanding something, and I would be grateful if anyone could help me with this. Thanks!
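For reference, the two pooling strategies being compared can be sketched with plain NumPy on a toy batch of contextualized token vectors (the shapes and values here are illustrative, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend output of a transformer encoder for one sentence:
# (seq_len, hidden_dim), where position 0 is the [CLS] token.
token_embeddings = rng.standard_normal((6, 8))

# Strategy 1 (original BERT paper): take the hidden vector
# of the [CLS] token as the sentence representation.
cls_embedding = token_embeddings[0]

# Strategy 2 (SBERT default): mean pooling over all token vectors.
mean_embedding = token_embeddings.mean(axis=0)

# Both yield a fixed-size vector of the hidden dimension.
print(cls_embedding.shape, mean_embedding.shape)  # (8,) (8,)
```

Either vector can then be fed to a downstream classifier; the difference is only in how the token vectors are reduced to one sentence vector.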

nreimers commented 2 years ago

For classification, both can be used: the [CLS] vector or mean pooling.

Have a look at: https://towardsdatascience.com/sentence-transformer-fine-tuning-setfit-outperforms-gpt-3-on-few-shot-text-classification-while-d9a3788f0b4e?gi=96ec1f1da301
