UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0
14.83k stars 2.44k forks

STS training, score normalization [0,1] vs [-1,1] #393

Closed cccntu closed 4 years ago

cccntu commented 4 years ago

https://github.com/UKPLab/sentence-transformers/blob/cfd4e3d4d4ac38f2d06438af783f36c94a571bd1/examples/training/sts/training_stsbenchmark.py#L68

The scores are normalized to [0,1], but cosine similarity is in the range of [-1, 1]. Is this a bug, or is this intended?
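For context, the linked line rescales the STSbenchmark gold scores, which range from 0 to 5. A minimal sketch of the two normalization choices being discussed (the helper names here are hypothetical, not part of the library):

```python
def normalize_01(score: float, max_score: float = 5.0) -> float:
    """Map a gold score in [0, max_score] to [0, 1] (what the example script does)."""
    return score / max_score

def normalize_pm1(score: float, max_score: float = 5.0) -> float:
    """Map a gold score in [0, max_score] to [-1, 1], the full cosine-similarity range."""
    return 2.0 * score / max_score - 1.0

# A gold score of 4.0 becomes 0.8 under [0,1] and 0.6 under [-1,1].
```

Since human-annotated STS pairs rarely express "opposite meaning", mapping the lowest score to -1 may not match what the data actually encodes, which is one plausible reason the [0,1] mapping can work better in practice.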

nreimers commented 4 years ago

I tested both back in 2019, and normalizing to 0...1 worked a bit better.

But you can easily test both yourself and see which works better for your data.