UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Why does CosineSimilarityLoss.py accept labels in the range (0, 1)? #2753

Open GeraldFZ opened 1 month ago

GeraldFZ commented 1 month ago

Hi! I hope you are doing well!

I am a newbie working on a prediction task using cosine similarity. CosineSimilarityLoss (https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/losses/CosineSimilarityLoss.py) does not seem to normalize the cosine similarity result into (0, 1) (I hope I understood it correctly), so the cosine similarity still has the range (-1, 1). May I ask why the loss still accepts similarity scores from 0 to 1 as labels? @tomaarsen

Thanks! :)
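For context, a minimal sketch (not the library's actual code) of what the linked loss appears to compute, assuming it takes the cosine similarity of the two embeddings and applies MSE against the label:

```python
import torch

# Hedged sketch of CosineSimilarityLoss's core computation:
# cosine similarity of the two embeddings (range [-1, 1]),
# then mean squared error against the label. Note there is no
# step that squashes the similarity into (0, 1).
def cosine_similarity_loss(emb_a, emb_b, labels):
    cos_sim = torch.cosine_similarity(emb_a, emb_b)  # in [-1, 1]
    return torch.nn.functional.mse_loss(cos_sim, labels)

# Opposite vectors have cosine similarity -1, so even the minimum
# label of the usual (0, 1) range (i.e. 0) yields a nonzero loss.
a = torch.tensor([[1.0, 0.0]])
b = torch.tensor([[-1.0, 0.0]])
loss = cosine_similarity_loss(a, b, torch.tensor([0.0]))
print(loss)  # tensor(1.)
```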

tomaarsen commented 1 month ago

Hello!

Great question! There was a similar question on SetFit (a project that finetunes SentenceTransformer models for classification) about this: https://github.com/huggingface/setfit/issues/254 There seem to be two parts to consider:

I believe you can also specify labels in the range (-1, 1) to the loss, so you're free to run some experiments yourself. For example, you can run this STSb training script first as-is, and then again with the labels remapped to (-1, 1). (Hint: `label * 2 - 1` should do it.)
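The remapping hinted at above can be sketched as follows, assuming the dataset's similarity scores are already normalized to [0, 1] (as in the usual STSb setup):

```python
# Map a label from [0, 1] onto the full cosine range [-1, 1].
def remap_label(label: float) -> float:
    return label * 2 - 1

# Endpoints and midpoint map as expected:
labels = [0.0, 0.25, 0.5, 1.0]
print([remap_label(l) for l in labels])  # [-1.0, -0.5, 0.0, 1.0]
```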