UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

BERT NLI model is giving bad cosine scores #558

Open ajinkya2903 opened 3 years ago

ajinkya2903 commented 3 years ago

Hey, @nreimers. I have fine-tuned distilbert-nli-mean-tokens on my custom data. It produces embeddings for every input sentence pair, but it gives high cosine scores for irrelevant sentences, and I do not understand why. Can you suggest a solution for this? [screenshot of example cosine scores]

Thanks
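
For reference, a minimal sketch of how such pair scores are computed with sentence-transformers. The checkpoint name is the public model the comment mentions; the two sentences are placeholder examples, not the poster's data:

```python
from sentence_transformers import SentenceTransformer, util

# Load the pretrained checkpoint (or the path of a fine-tuned model).
model = SentenceTransformer("distilbert-base-nli-mean-tokens")

sentences = [
    "A man is eating food.",           # placeholder sentence
    "The stock market fell sharply.",  # placeholder, semantically unrelated
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two embeddings. As noted below,
# irrelevant pairs can still land around 0.5-0.6 with NLI-trained models.
score = util.cos_sim(embeddings[0], embeddings[1])
print(score.item())
```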

nreimers commented 3 years ago

0.6 is not that high. But the results depend on your training data and the chosen loss function.

ajinkya2903 commented 3 years ago

I chose ContrastiveLoss as the loss function, and my training data was of a considerable size. Is my loss function selection making a difference here? I know 0.6 is not that high, but for some other examples it gives a cosine score of around 0.8. Is there any way to fix this?

Thanks
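
For context, a sketch of how ContrastiveLoss is typically wired up in this library; the training pairs and the margin value here are illustrative assumptions, not the poster's actual setup. With the default cosine distance metric, the margin controls how far negative pairs are pushed apart, which may be one reason dissimilar pairs can still end up with fairly high cosine scores:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("distilbert-base-nli-mean-tokens")

# Illustrative pairs: label 1 = similar, label 0 = dissimilar.
train_examples = [
    InputExample(texts=["How do I reset my password?",
                        "I forgot my password."], label=1),
    InputExample(texts=["How do I reset my password?",
                        "What is the weather today?"], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# With the default cosine distance, negative pairs stop contributing
# to the loss once their distance exceeds the margin (default 0.5),
# so a dissimilar pair can still sit at cosine similarity ~0.5.
train_loss = losses.ContrastiveLoss(model=model, margin=0.5)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
```

If the goal is calibrated cosine scores, one alternative the library offers is losses.CosineSimilarityLoss with float similarity labels, which regresses the cosine score directly rather than optimizing a margin.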