UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Fine-tuning for a simple binary classification task?? #90

caprone commented 4 years ago

Hi, and thanks for the great library ;))

I see that it is not ready for fine-tuning on binary or multi-label tasks without manually adding a linear layer on top of the model and implementing the binary cross-entropy loss... right? Or am I wrong?

nreimers commented 4 years ago

Hi @caprone, why would you perform classification with sentence-transformers / Sentence-BERT?

This repository is designed to create dense vector representations for sentences, which can then be used, for example, for clustering.

For classification tasks, I recommend using the original BERT / RoBERTa models; they were designed for this.

Best,
Nils Reimers

hockeybro12 commented 4 years ago

What if we want to fine-tune the sentence representations to a new domain using a classification task and then use the sentence representations for clustering?

Basically, I'm wondering how I can do domain adaptation with this model. I thought I could maybe fine-tune it on a binary classification task?

nreimers commented 4 years ago

Yes, you can fine-tune the model with a classification task (this is commonly done, for example, with NLI datasets). Whether this yields good representations for clustering depends on the task; sadly, there is no magic that guarantees this fine-tuning produces good representations. You would need to test it and see if it works.
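
For example, a minimal sketch of such classification-based fine-tuning with the library's SoftmaxLoss (the model name and the toy NLI-style pairs are placeholders; assumes the InputExample / model.fit API):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Placeholder model and data -- substitute your own domain pairs.
model = SentenceTransformer("bert-base-nli-mean-tokens")
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating pasta."], label=1),
    InputExample(texts=["A man is eating food.", "A girl is playing guitar."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# SoftmaxLoss puts a classification head over the (concatenated) pair embeddings.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=2,
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```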

hockeybro12 commented 4 years ago

@nreimers Can I fine-tune it with a binary prediction task? I want to use the model on a different domain, and I don't have NLI-style data from that domain.

nreimers commented 4 years ago

@hockeybro12 Yes, you can do that.

hockeybro12 commented 4 years ago

So I would use the softmax loss in this case, right? I don't have two sentences, just one sentence with one label. @nreimers

nreimers commented 4 years ago

You would need to write a new loss model / function that takes your single sentence and applies softmax. The implemented softmax loss expects two sentences as input.
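
A minimal sketch of such a loss, modeled on the library's SoftmaxLoss but taking a single sentence (the class name is hypothetical, not part of the library):

```python
from torch import nn
from sentence_transformers import SentenceTransformer

class SingleSentenceSoftmaxLoss(nn.Module):
    """Hypothetical loss: embed one sentence, apply a linear layer + cross-entropy."""

    def __init__(self, model: SentenceTransformer, embedding_dim: int, num_labels: int):
        super().__init__()
        self.model = model
        self.classifier = nn.Linear(embedding_dim, num_labels)
        self.loss_fct = nn.CrossEntropyLoss()

    def forward(self, sentence_features, labels):
        # sentence_features holds one dict of tokenized inputs (one sentence per example).
        embeddings = self.model(sentence_features[0])["sentence_embedding"]
        logits = self.classifier(embeddings)
        return self.loss_fct(logits, labels.view(-1))
```

It could then be passed to model.fit like the built-in losses, with training data of the form InputExample(texts=[sentence], label=label).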

csmyth93 commented 4 years ago

Is it possible to use the outputs from SentenceTransformer.encode() as input to a classification model? Part of my data is a text column that I was looking to embed and then pass, together with some one-hot-encoded categorical data, as input to a multi-class classification model.

nreimers commented 4 years ago

Yes, that is possible.

SentenceTransformer is a regular pytorch model, so you could even build your classifier on top of it.

Otherwise, you can pre-process the data and map all text data to a vector first, which is then passed to your classifier.
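
For the pre-processing route, a minimal sketch (the model name, toy data, and the scikit-learn classifier are illustrative choices, not prescribed by the library):

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

model = SentenceTransformer("bert-base-nli-mean-tokens")

texts = ["first document", "second document"]  # your text column
categorical = np.array([[1, 0], [0, 1]])       # one-hot-encoded categorical data
labels = np.array([0, 1])

# Map the text column to fixed-size vectors, then append the one-hot features.
embeddings = model.encode(texts)               # shape: (n_samples, embedding_dim)
features = np.hstack([embeddings, categorical])

clf = LogisticRegression(max_iter=1000).fit(features, labels)
```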

swyoon commented 4 years ago

However, the output of encode() is a list of NumPy arrays, which do not support PyTorch's gradient operations. Can I simply "forward" a sentence and get a representation for which I can compute gradients with PyTorch?

madstuntman11 commented 4 years ago

@swyoon This may be helpful: https://github.com/UKPLab/sentence-transformers/issues/255#issuecomment-638010977
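
For reference, the approach discussed there amounts to calling the model as a regular PyTorch module instead of using encode(). A minimal sketch (assumes a version where model.tokenize accepts a list of strings; the model name is a placeholder):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bert-base-nli-mean-tokens")

# Tokenize, then forward through the full module stack. Unlike encode(),
# this keeps the autograd graph, so gradients can flow into the weights.
features = model.tokenize(["an example sentence"])
device = next(model.parameters()).device
features = {key: value.to(device) for key, value in features.items()}
embedding = model(features)["sentence_embedding"]

loss = embedding.pow(2).sum()  # dummy scalar objective, just for illustration
loss.backward()                # gradients now populate the transformer parameters
```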