UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Fine-tune the model using Sentence Pair Classification with custom Data and not using GLUE data #125

Open ankush20m opened 4 years ago

ankush20m commented 4 years ago

Hello All,

I have my own domain-specific dataset of sentence pairs with similarity scores, similar to the Microsoft Research Paraphrase Corpus (MRPC). I want to fine-tune the language model on this custom data.

Could anyone tell me how I can set up a sentence pair classification task and fine-tune the model?

nreimers commented 4 years ago

Do you want to do pair classification or regression? In regression, you are given a score in the range 0...1 for each pair, and you try to estimate this continuous score using cosine similarity.

For classification see: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training_nli_bert.py

For regression, see: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training_stsbenchmark_bert.py
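To make the regression case concrete, here is a minimal sketch in plain Python, with no library calls. It assumes each sentence has already been encoded into a vector; the toy 2-dimensional vectors and the helper names `cosine_similarity` and `regression_loss` are made up for illustration. It computes the mean squared error between the cosine similarity of a pair and its gold score, which is the quantity a cosine-similarity regression objective tries to minimize.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def regression_loss(pairs):
    """Mean squared error between cosine similarity and the gold score."""
    errors = [(cosine_similarity(u, v) - score) ** 2 for u, v, score in pairs]
    return sum(errors) / len(errors)


# Toy "embeddings" (hypothetical): (vector_a, vector_b, gold score in 0...1)
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, gold score 1.0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, gold score 0.0
]
loss = regression_loss(pairs)
print(loss)
```

In a real setup the vectors would come from the model being fine-tuned, and the loss would be backpropagated through it. For classification, the pair is instead mapped to a discrete label and trained with a classification loss.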

ankush20m commented 4 years ago

My task is classification: each pair has a label of 0 or 1, i.e. whether the two sentences are similar or not.

BTW Thanks @nreimers :) I will explore this.
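For reference, a minimal sketch of how custom 0/1-labelled pairs could be loaded before training. The tab-separated layout, the column names `sentence_a`, `sentence_b`, `label`, the `PairExample` class, and the sample sentences are all assumptions for illustration, not the format of any particular dataset or the library's own data class.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PairExample:
    """One labelled sentence pair (hypothetical container)."""
    sentence_a: str
    sentence_b: str
    label: int  # 1 = similar, 0 = not similar


def load_pairs(handle):
    """Read tab-separated rows with a header line into PairExample objects."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    examples = []
    for row in reader:
        label = int(row["label"])
        if label not in (0, 1):
            raise ValueError(f"expected label 0 or 1, got {label}")
        examples.append(PairExample(row["sentence_a"], row["sentence_b"], label))
    return examples


# Made-up sample data standing in for a domain-specific file.
sample = (
    "sentence_a\tsentence_b\tlabel\n"
    "The pump is leaking.\tThe pump has a leak.\t1\n"
    "The pump is leaking.\tThe invoice is overdue.\t0\n"
)
examples = load_pairs(io.StringIO(sample))
print(len(examples), examples[0].label, examples[1].label)
```

With a real file, `io.StringIO(sample)` would be replaced by `open(path, newline="", encoding="utf-8")`. Each `PairExample` would then be converted into whatever input type the training script expects.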

prince14322 commented 4 years ago

The links in the reply above are not working. Could you please share updated links to the fine-tuning examples?

nreimers commented 4 years ago

@prince14322 You can find all examples here: https://github.com/UKPLab/sentence-transformers/tree/master/examples

Yuanlu1225 commented 2 years ago

@prince14322 You can find all examples here: https://github.com/UKPLab/sentence-transformers/tree/master/examples

I can't find the classification example. Could you tell me which file it is? training_nli_bert.py is missing.