UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Question: loading custom BERT model and cosine similarity #310

Closed: mirandrom closed this issue 4 years ago

mirandrom commented 4 years ago

Hello, thanks for the great library. If I load a custom BERT model I fine-tuned on text classification (i.e. the cosine similarity of embeddings does not necessarily mean anything), does your library have any functionality to make the cosine similarity more meaningful? Or am I missing something, and would the cosine similarity here actually be representative of semantic similarity? Thank you!

nreimers commented 4 years ago

"does your library have any functionality to make the cosine similarity more meaningful?"

Sadly, there is no magic that would enable this. To make cosine similarity more meaningful, you need data that indicates which sentence pairs are considered similar and which are considered dissimilar.
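For illustration, here is a minimal sketch of what such pair data and a fine-tuning step could look like. It assumes the classic `model.fit` training API with `losses.CosineSimilarityLoss`; the helper names `make_training_pairs` and `finetune_on_pairs`, the sentences, and the scores are all made up for the example and are not taken from this thread.

```python
def make_training_pairs():
    """Toy labeled pairs: (sentence_a, sentence_b, similarity in [0, 1]).

    The sentences and scores are invented placeholders; real training
    needs your own annotated pairs.
    """
    return [
        ("A man is playing a guitar.", "Someone plays a guitar.", 0.9),
        ("A man is playing a guitar.", "A chef is cooking pasta.", 0.1),
        ("The weather is cold today.", "It is freezing outside.", 0.8),
    ]


def finetune_on_pairs(model, pairs, epochs=1, batch_size=16):
    """Fine-tune a SentenceTransformer so cosine similarity tracks the labels."""
    # Imported lazily so the data helper above runs without these packages.
    from torch.utils.data import DataLoader
    from sentence_transformers import InputExample, losses

    examples = [
        InputExample(texts=[a, b], label=float(score)) for a, b, score in pairs
    ]
    loader = DataLoader(examples, shuffle=True, batch_size=batch_size)
    # Regresses cos(embedding_a, embedding_b) toward the provided score.
    loss = losses.CosineSimilarityLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=epochs, warmup_steps=0)
    return model
```

With only binary similar/dissimilar labels, scores of 1.0 and 0.0 could be used in place of graded values; whether that is enough depends on the data.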

But you can still try cosine similarity as is. Depending on your text classification task, it might work. For example, the models here are first fine-tuned on NLI, which is also a text classification task (with 3 labels), and the resulting embeddings are quite good.
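One way to try this out is sketched below: wrap the fine-tuned BERT checkpoint in a `SentenceTransformer` with mean pooling, encode two sentences, and compare them. The helper names `load_custom_bert` and `compare` are hypothetical, mean pooling is an assumption (other pooling modes may suit your model better), and the cosine function is written in plain Python so it can be checked on its own.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def load_custom_bert(model_path):
    """Wrap a BERT checkpoint with mean pooling over token embeddings."""
    # Imported lazily so cosine_similarity works without the library installed.
    from sentence_transformers import SentenceTransformer, models

    word_embedding_model = models.Transformer(model_path)
    pooling_model = models.Pooling(
        word_embedding_model.get_word_embedding_dimension(),
        pooling_mode_mean_tokens=True,
    )
    return SentenceTransformer(modules=[word_embedding_model, pooling_model])


def compare(model, sentence_a, sentence_b):
    """Encode two sentences and return their cosine similarity."""
    emb_a, emb_b = model.encode([sentence_a, sentence_b])
    return cosine_similarity(emb_a.tolist(), emb_b.tolist())
```

Usage would look like `model = load_custom_bert("path/to/your-finetuned-bert")` followed by `compare(model, "first sentence", "second sentence")`, where the path is a placeholder for your own checkpoint directory. Spot-checking a few pairs you know to be similar or unrelated gives a quick sense of whether the scores are usable.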

mirandrom commented 4 years ago

I see, thanks for the prompt response!