UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

fine tuning models for semantic search #437

Open mchari opened 3 years ago

mchari commented 3 years ago

I tried using SentenceTransformer with a couple of models, including roberta-base-nli-stsb-mean-tokens, to do semantic search (cosine similarity), but I didn't get encouraging results. I wanted to see if it helps to fine-tune the models on my own documents. Just wanted to confirm whether the training example in sentence-transformers/examples/training/sts/training_stsbenchmark_continue_training.py would be the recommended approach?

Any other suggestions will be appreciated.

nreimers commented 3 years ago

Yes, that is one way. In general, see the documentation and examples on training: https://www.sbert.net/docs/training/overview.html
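
For reference, here is a minimal sketch of the continued fine-tuning setup that training_stsbenchmark_continue_training.py follows. The training pairs, similarity scores, and output path below are placeholders you would replace with your own in-domain data:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load the pre-trained model to continue training from
model = SentenceTransformer("roberta-base-nli-stsb-mean-tokens")

# Placeholder in-domain sentence pairs with similarity labels in [0, 1]
train_examples = [
    InputExample(texts=["first sentence", "a similar sentence"], label=0.9),
    InputExample(texts=["first sentence", "an unrelated sentence"], label=0.1),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# Fine-tune and save the updated model
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    warmup_steps=100,
    output_path="output/continue-training-model",
)
```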

himanshusingh47 commented 3 years ago

Hi, I tried fine-tuning roberta-base-nli-stsb-mean-tokens with my own dataset for semantic search, as discussed in sentence-transformers/examples/training/sts/training_stsbenchmark_continue_training.py, but the results became worse than before tuning. Now it produces almost the same embedding for every sentence passed, and I don't know what went wrong. Also, does fine-tuning update all the parameters of the model? How would I freeze some layers and train on only a few layers, so that the initial weights are preserved? And how would I train the model if my dataset contains only text and no labelled similarity between sentences? I am new to this field; any help would be appreciated.
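
On the layer-freezing question, one possible approach is to disable gradients on the embeddings and lower encoder layers before calling model.fit. This is only a rough sketch: it relies on the internal _first_module() and auto_model attributes of the current SentenceTransformer implementation, so treat those attribute names and the choice of how many layers to freeze as assumptions:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("roberta-base-nli-stsb-mean-tokens")

# The first module of a SentenceTransformer is typically the Transformer wrapper;
# its .auto_model attribute holds the underlying Hugging Face model.
auto_model = model._first_module().auto_model

# Freeze the embeddings and the lower encoder layers, leaving the rest trainable
for param in auto_model.embeddings.parameters():
    param.requires_grad = False
for layer in auto_model.encoder.layer[:8]:  # e.g. freeze the first 8 of 12 layers
    for param in layer.parameters():
        param.requires_grad = False
```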

mchari commented 3 years ago

On a related note, just wanted to provide a data point: of all the pre-trained models I have tried for semantic search, distilbert-multilingual-nli-stsb-quora-ranking gives the most meaningful results. Is there any plan to release other BERT variants (BERT, RoBERTa, etc.) trained for this purpose? Thanks!
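
For context, here is a small usage sketch of how that model can be used for semantic search via util.semantic_search; the corpus and query strings are made-up examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("distilbert-multilingual-nli-stsb-quora-ranking")

# Example corpus to search over
corpus = [
    "How do I reset my password?",
    "What is the refund policy?",
    "Where can I download the mobile app?",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Encode the query and retrieve the top-k most similar corpus entries
query_embedding = model.encode("I forgot my login credentials", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]

for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```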

nreimers commented 3 years ago

Hi @mchari, I think the differences between BERT / RoBERTa / DistilBERT etc. are rather small, so currently there is no plan to provide other *-nli-stsb-quora-ranking models.

However, there is a plan to release various new models trained on other datasets with different targets / purposes.