caprone opened this issue 4 years ago
Hi @caprone, why would you perform classification with sentence-transformers / sentence-bert?
This repository is designed to create dense vector representations for sentences, which can then be used, for example, for clustering.
For classification tasks, I recommend using the original BERT / RoBERTa models. They were designed for this.
Best, Nils Reimers
What if we want to fine-tune the sentence representations to a new domain using a classification task and then use the sentence representations for clustering?
Basically, I'm wondering how I can do domain adaptation with this model. Could I maybe fine-tune it on a binary classification task?
Yes, you can fine-tune the model with a classification task (as is commonly done, for example, with the NLI dataset). Whether this yields good representations for clustering depends on the task; sadly, there is no magic that guarantees it. You would need to test it and see if it works.
@nreimers Can I fine-tune it with a binary prediction task? I want to use the model on a different domain and I don't have NLI style data from that domain.
@hockeybro12 Yes, you can do that.
So I would use the softmax loss in this case, right? I don't have two sentences, just one sentence with one label. @nreimers
You would need to write a new loss model / function that takes your one sentence and applies softmax. The implemented softmax loss expects two sentences as input.
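A sketch of what such a single-sentence loss could look like, following the list-of-features/label calling convention of the library's loss classes (the class name `SingleSentenceSoftmaxLoss` and the stub encoder are made up for illustration, not part of the library):

```python
import torch
from torch import nn

class SingleSentenceSoftmaxLoss(nn.Module):
    """Hypothetical loss: cross-entropy over a linear head on a single
    sentence embedding, instead of the library's two-sentence SoftmaxLoss."""

    def __init__(self, model, sentence_embedding_dimension, num_labels):
        super().__init__()
        self.model = model
        self.classifier = nn.Linear(sentence_embedding_dimension, num_labels)
        self.loss_fct = nn.CrossEntropyLoss()

    def forward(self, sentence_features, labels):
        # sentence_features is a list holding one tokenized batch; the
        # wrapped model returns a dict containing 'sentence_embedding'
        embedding = self.model(sentence_features[0])["sentence_embedding"]
        logits = self.classifier(embedding)
        return self.loss_fct(logits, labels)

# Quick check with a stub encoder that just passes features through
class StubEncoder(nn.Module):
    def forward(self, features):
        return {"sentence_embedding": features["input"]}

loss_fn = SingleSentenceSoftmaxLoss(
    StubEncoder(), sentence_embedding_dimension=8, num_labels=2
)
batch = [{"input": torch.randn(4, 8)}]
labels = torch.tensor([0, 1, 1, 0])
loss = loss_fn(batch, labels)  # scalar tensor, differentiable
```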
Is it possible to use the outputs of SentenceTransformer.encode() as input to a classification model? Part of my data is a text column that I was looking to embed; I would then pass this, plus some one-hot-encoded categorical data, as input to a multi-class classification model.
Yes, that is possible.
SentenceTransformer is a regular PyTorch model, so you could even build your classifier on top of it.
Otherwise, you can pre-process the data and first map all text data to a vector, which is then passed to your classifier.
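A sketch of that pre-processing route: here the 384-dimensional vectors stand in for SentenceTransformer.encode() output (384 is the dimension of e.g. the MiniLM models), and all sizes and the random data are illustrative:

```python
import numpy as np
import torch
from torch import nn

# Pretend these came from SentenceTransformer.encode(): (n_samples, 384)
text_vecs = np.random.randn(8, 384).astype("float32")

# One-hot-encoded categorical column with 4 categories
cats = np.eye(4, dtype="float32")[np.random.randint(0, 4, size=8)]

# Concatenate text embeddings and categorical features per sample
features = torch.from_numpy(np.hstack([text_vecs, cats]))  # (8, 388)

# Simple multi-class classifier over the combined features
classifier = nn.Sequential(
    nn.Linear(388, 64),
    nn.ReLU(),
    nn.Linear(64, 5),  # 5 target classes
)
logits = classifier(features)  # (8, 5)
```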
However, the output of encode() is a list of NumPy arrays, which does not support PyTorch's gradient operations. Can I simply "forward" a sentence and get a representation for which I can compute gradients using PyTorch?
@swyoon This may be helpful: https://github.com/UKPLab/sentence-transformers/issues/255#issuecomment-638010977
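The linked comment comes down to this: encode() runs without gradient tracking and returns NumPy arrays, so the computation graph is gone, while calling the model's forward keeps it. A toy pure-PyTorch encoder illustrating the difference (a stand-in for the pattern, not the SentenceTransformer API):

```python
import torch
from torch import nn

class ToyEncoder(nn.Module):
    """Stand-in sentence encoder: embed token ids, then mean-pool."""

    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids):
        return self.emb(token_ids).mean(dim=1)  # (batch, dim) sentence vectors

    def encode(self, token_ids):
        # Mimics SentenceTransformer.encode(): no_grad + NumPy, graph is lost
        with torch.no_grad():
            return self.forward(token_ids).cpu().numpy()

model = ToyEncoder()
token_ids = torch.randint(0, 100, (2, 5))

detached = model.encode(token_ids)  # NumPy array: no gradients possible
live = model(token_ids)             # tensor still attached to the graph
live.sum().backward()               # gradients flow into the embedding layer
```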
Hi, and thanks for the great library ;))
I see that it is not ready for tuning on binary or multi-label tasks without manually adding a linear layer on top of the model and implementing the binary cross-entropy loss... right? Or am I wrong?