arita37 opened this issue 2 years ago
Yes, that is possible.
You encode your text to embeddings and then train a classifier on top, e.g. one of the classifiers from SKLearn.
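For illustration, a minimal sketch of that workflow: the encoder stays frozen and only the scikit-learn classifier is trained. The model name and toy data are placeholders.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Any pretrained SentenceTransformer works; this model name is just an example.
model = SentenceTransformer("all-MiniLM-L6-v2")

train_texts = ["great product", "terrible support"]  # toy data for illustration
train_labels = [1, 0]

# Encode texts to fixed-size embeddings (the encoder itself is not updated here).
X_train = model.encode(train_texts)

# Train any scikit-learn classifier on top of the embeddings.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train_labels)

print(clf.predict(model.encode(["awful experience"])))
```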
Thanks.
How can I fine-tune the base sentence encoder using a classification head (or another loss, like triplet loss)?
Having multiple heads to plug in for fine-tuning would be very beneficial.
Thanks
Hello,
Coming back to this topic: is there a way to add a classifier head (cross-entropy) on top of a SentenceTransformer
to perform classification fine-tuning?
If it is not available, should I open a pull request?
Thanks
Usually it is recommended to just use the cross-encoder. It is specifically designed for this and works better than a SentenceTransformer + classification head in nearly all cases. The cross-encoder updates all weights of the model during backpropagation.
Otherwise, here is an example of adding a LR classifier on top of an embedding model: https://towardsdatascience.com/sentence-transformer-fine-tuning-setfit-outperforms-gpt-3-on-few-shot-text-classification-while-d9a3788f0b4e?gi=58badbe8fcd7
Thanks.
But a cross-encoder takes 2 sentences as input, and I am only interested in fine-tuning via classification, since my dataset is already labeled as (sentence_i, label_i) pairs.
I am not interested in the classification result itself, but in fine-tuning the embeddings through a classifier, and I prefer to keep a single model framework.
In general, having the ability to plug in different heads would be beneficial.
Thanks
A Cross-Encoder can also be used with single-sentence inputs. Just leave the second sentence empty.
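A minimal sketch of how that could look with the CrossEncoder training API; the base model name, label count, toy data, and hyperparameters are all placeholders.

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample
from sentence_transformers.cross_encoder import CrossEncoder

# Base transformer and number of classes are placeholders; adjust to your task.
model = CrossEncoder("distilroberta-base", num_labels=2)

# Single-sentence classification: leave the second text empty.
train_examples = [
    InputExample(texts=["great product", ""], label=1),
    InputExample(texts=["terrible support", ""], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# fit() updates all weights of the underlying transformer.
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=10)

print(model.predict([["awful experience", ""]]))
```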
For fine-tuning an embedding model with a classifier: there is the SoftmaxLoss, which adds a softmax classifier on top. But the results are not that good when you are interested in the embeddings, as there is no theoretical motivation for why this should produce good embeddings.
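For reference, a minimal sketch of the standard SoftmaxLoss setup (as used for NLI training); note that this loss expects sentence pairs, and the model name and data here are placeholders.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

# SoftmaxLoss is pair-based: each training example holds two texts and a label.
train_examples = [
    InputExample(texts=["a soccer game", "someone is playing a sport"], label=0),
    InputExample(texts=["a soccer game", "someone is sleeping"], label=2),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)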
Otherwise have a look at SetFit and BatchHardTripletLoss
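BatchHardTripletLoss fits the (sentence_i, label_i) setting directly: one text per example plus an integer class label. A minimal sketch under those assumptions (model name and data are placeholders); SentenceLabelDataset is used so every batch contains positive pairs, which batch-hard triplet mining requires.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses
from sentence_transformers.datasets import SentenceLabelDataset

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

# One text per example, with an integer class label: (sentence_i, label_i).
train_examples = [
    InputExample(texts=["great product"], label=1),
    InputExample(texts=["loved it"], label=1),
    InputExample(texts=["terrible support"], label=0),
    InputExample(texts=["never again"], label=0),
]

# SentenceLabelDataset draws samples so each label appears multiple times
# per batch, giving the loss in-batch positives and negatives to mine.
train_dataset = SentenceLabelDataset(train_examples)
train_dataloader = DataLoader(train_dataset, batch_size=16)

train_loss = losses.BatchHardTripletLoss(model=model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```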
Thx. On cross-entropy and metric losses, there is this good paper:
https://arxiv.org/pdf/2003.08983.pdf
It seems cross-entropy can perform well too.
Similar issue: a SentenceTransformer model encodes a sentence and gives out an embedding/vector. I wish to fine-tune this model using my own classification training data. Once the fine-tuned model is ready, I wish to use it for encoding sentences into embeddings/vectors. Is this possible? Basically, I want to update or fine-tune the SentenceTransformer model with my domain corpus. It can be with classification training data or simply just texts. Anything is ok. Possible?
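For reference, the losses discussed above support exactly this workflow: fine-tune the SentenceTransformer, save it, then reload it as a plain encoder. A minimal sketch (the model name and output path are placeholders):

```python
from sentence_transformers import SentenceTransformer

# Placeholder base model; fine-tune it with one of the losses sketched above.
model = SentenceTransformer("all-MiniLM-L6-v2")
# model.fit(...)  # e.g. BatchHardTripletLoss on your labeled domain data
model.save("output/my-domain-model")  # output path is a placeholder

# Later: reload the fine-tuned model and use it purely as an encoder.
tuned = SentenceTransformer("output/my-domain-model")
embeddings = tuned.encode(["a sentence from my domain"])
print(embeddings.shape)
```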
Hi,
Just wondering, is it possible to add this simple task on top of Sentence-Transformers?
I could not find a pre-built task for single sentences.
Thanks