AndriyMulyar / semantic-text-similarity

an easy-to-use interface to fine-tuned BERT models for computing semantic similarity in clinical and web text. that's it.
MIT License

Problem with space #11

Open saravanansaminathan opened 4 years ago

saravanansaminathan commented 4 years ago

When I compute the similarity between two strings, the model gives a low score for the same words written with and without a space.
Code: web_model.predict([('crm plus','crmplus')])
Output: 0.8329288

But for some other words the score is high:
Code: web_model.predict([('iphone plus','iphoneplus')])
Output: 3.4955034 (high similarity)

Sheldon1999 commented 4 years ago

I don't know much about the implementation, but this is probably because the model does not measure similarity on the basis of characters or spaces. It focuses on the meaning of the sentence (phrase). As the model sees it, the first pair might be less similar in meaning than the second pair.
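For contrast, here is a minimal sketch of what a purely character-based measure gives for the same two pairs. It uses `difflib.SequenceMatcher` from the Python standard library, which is not part of this repository; the helper name `char_similarity` is made up for illustration. Its ratio lies between 0 and 1 and is not comparable to the scores returned by `web_model.predict`.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1].

    Hypothetical helper for illustration only; it looks at matching
    characters and knows nothing about meaning.
    """
    return SequenceMatcher(None, a, b).ratio()


# The two pairs from this issue: same words with and without a space.
pairs = [("crm plus", "crmplus"), ("iphone plus", "iphoneplus")]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: {char_similarity(a, b):.3f}")
```

A character-based measure rates both pairs as nearly identical, because each pair differs by a single space. If the space matters that little at the character level, the gap in the model's scores more likely comes from how it represents the unspaced word than from the edit itself.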