Open zachschillaci27 opened 1 year ago
When using SetFit for classification in a more technical domain, I could imagine that the generically trained SBERT models may produce poor sentence embeddings if the domain is not represented well enough in their diverse training corpus. In this case, would it be advantageous to first apply domain adaptation techniques (as discussed here) to an SBERT model before using it as the base model in SetFit? Have you considered and/or tested such an approach? Thanks for the help!
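To make it concrete, something like the following sketch is what I have in mind, using TSDAE as one example of an unsupervised adaptation technique (the model name and the in-domain corpus are placeholders, not a tested recipe):

```python
import nltk
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, datasets, losses

nltk.download("punkt")  # needed by the default TSDAE noise function

# Placeholder: unlabeled sentences from the technical domain.
domain_sentences = ["first in-domain sentence", "second in-domain sentence"]

model_name = "sentence-transformers/all-MiniLM-L6-v2"
model = SentenceTransformer(model_name)

# TSDAE: corrupt each sentence and train the encoder so a tied decoder
# can reconstruct the original from the sentence embedding.
train_dataset = datasets.DenoisingAutoEncoderDataset(domain_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path=model_name, tie_encoder_decoder=True
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    show_progress_bar=True,
)
model.save("sbert-domain-adapted")  # to be used as the SetFit base
```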
Bumping this. I have the same question. I'd figure this might improve accuracy on classification tasks as well: first domain-adapt with unsupervised data, then fit to the specific task with labeled data, in a very sample-efficient way thanks to SetFit. Wouldn't this yield even better results for all tasks? Thanks.
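For the second stage, I imagine pointing SetFit at the adapted checkpoint would look roughly like this (the checkpoint path and the labeled dataset are just illustrative; sst2 stands in for a real few-shot task):

```python
from datasets import load_dataset
from setfit import SetFitModel, SetFitTrainer

# Load the domain-adapted SBERT checkpoint saved above as the SetFit body.
model = SetFitModel.from_pretrained("sbert-domain-adapted")

# Placeholder labeled data -- swap in your own few-shot classification set.
dataset = load_dataset("sst2")
train_ds = dataset["train"].select(range(64))
eval_ds = dataset["validation"]

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    column_mapping={"sentence": "text", "label": "label"},
)
trainer.train()
metrics = trainer.evaluate()
print(metrics)
```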