UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

sentence-transformers vs. transformers for semantic similarity #1477

Open hanshupe opened 2 years ago

hanshupe commented 2 years ago

I have ~5000 paper abstracts (mechanical engineering domain) and want to find the 100 most similar ones. After some research, I found that there are several options:

Can someone clarify:

pritamdeka commented 2 years ago

Hi. I have worked with both libraries and have found that SBERT is a much better choice for semantic similarity tasks. For classification tasks, however, the cross-encoder architecture of BERT is better. As for your second question, you can easily fine-tune any HF model on your data using the sentence-transformers library; see the examples provided for the different tasks. If you have too little data for fine-tuning and need more, you can also check the Augmented SBERT section.
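
To make the semantic similarity suggestion concrete, here is a minimal sketch rather than a definitive recipe. It assumes "most similar" means ranking all abstracts against one query abstract by cosine similarity. The helper names `embed_abstracts` and `top_k_similar` are hypothetical, and the model name `all-MiniLM-L6-v2` is only an illustrative choice that nobody in this thread recommended.

```python
import numpy as np


def embed_abstracts(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts into sentence embeddings (hypothetical helper).

    Assumption: sentence-transformers is installed and the model,
    an illustrative choice, can be downloaded.
    """
    # Imported lazily so the ranking function below works without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts, convert_to_numpy=True, normalize_embeddings=True)


def top_k_similar(embeddings, query_index, k=100):
    """Return (index, cosine similarity) pairs for the k rows closest to one row."""
    emb = np.asarray(embeddings, dtype=np.float64)
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.clip(norms, 1e-12, None)
    scores = emb @ emb[query_index]
    scores[query_index] = -np.inf  # never return the query itself
    k = min(k, len(scores) - 1)
    order = np.argsort(-scores)[:k]  # highest similarity first
    return [(int(i), float(scores[i])) for i in order]
```

With a list of abstracts in `abstracts`, something like `top_k_similar(embed_abstracts(abstracts), query_index=0, k=100)` would give the 100 nearest abstracts to the first one. For ~5000 abstracts the full similarity computation fits comfortably in memory, so no approximate nearest-neighbour index should be needed.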