UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Usage of vectors #1313

Open arita37 opened 2 years ago

arita37 commented 2 years ago

One question:

After encoding a list of sentences, does it make sense to average the sentence embeddings into a single vector?

Context: I have a list of sentences per topic and need to calculate one vector for each topic, then compare it with other sentences.

Also, the number of documents changes over time, so I need a simple method.

nreimers commented 2 years ago

Yes, you can do this.
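For illustration, here is a minimal sketch of that approach. It is not taken from the library: the helper names (`mean_vector`, `cosine`, `add_to_mean`) are hypothetical, and the 3-dimensional vectors are toy stand-ins for what `model.encode(sentences)` would return. The running-mean helper is one simple way to handle a topic whose sentence list grows over time without re-averaging everything.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def add_to_mean(mean, count, new_vector):
    """Fold one more vector into a running mean; returns (new_mean, new_count)."""
    new_count = count + 1
    new_mean = [m + (v - m) / new_count for m, v in zip(mean, new_vector)]
    return new_mean, new_count

# Toy stand-ins for the sentence embeddings of one topic (assumed values).
topic_sentences = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.9, 0.1, 0.0]]
topic_vec = mean_vector(topic_sentences)

# Toy stand-ins for two other sentences to compare against the topic.
query_near = [1.0, 0.1, 0.0]
query_far = [0.0, 0.0, 1.0]

print(cosine(topic_vec, query_near) > cosine(topic_vec, query_far))
```

Storing only the current mean and the sentence count per topic is enough to call `add_to_mean` whenever a new sentence arrives.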

arita37 commented 2 years ago

Thanks. To understand the background a bit: it is not advisable to average the BERT word vectors, but averaging sentence-level embeddings is fine?

Thanks.

nreimers commented 2 years ago

The models also average the word vectors to create the representation for the complete text.

What is important is to use a tuned model that was trained for this task, and not to use BERT out of the box.
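As a rough sketch of what "averaging the word vectors" can look like, the snippet below averages token vectors while skipping padding positions. The function name `mean_pool`, the mask convention (1 = real token, 0 = padding), and the 2-dimensional toy vectors are assumptions made for illustration, not the library's actual code.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    n = len(kept)
    return [sum(col) / n for col in zip(*kept)]

# Two real tokens followed by one padding position (assumed toy values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = mean_pool(tokens, mask)
print(sentence_vec)
```

The pooling step is the same for a tuned model and for plain BERT; what differs is whether the token vectors were trained so that their average is meaningful for comparison.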