Limitations of Setfit Model

nv78 commented 1 year ago

Hi, I was wondering about your thoughts on some of the limitations of the SetFit model. Can it support any sort of few-shot text classification, or are there areas where this model falls short? Are there any research papers or ideas that address some of these limitations?

Also, is the model available to call via Hugging Face's Inference API for enterprise? We saw the Ag-News endpoint, but are there any other endpoints that are more generalizable? If not, how would you recommend distilling a derivative of this model for production?

MosheWasserb commented 1 year ago

Some interesting research directions for SetFit:

Handling imbalanced datasets
Adding unlabeled data for semi-supervised learning
Other use cases such as token classification (e.g. NER)
Two-sentence classification (e.g. NLI, MRPC)

oostopitre commented 1 year ago

Is there a way for SetFit to support multi-task learning? That is, for every data point I want to compute a loss that factors in multiple output labels simultaneously. The column mapping expects only two fields (text and label), which makes me think the assumption of only one label per row is deeply baked in. I would love to know if there are any workarounds or options for multi-task learning.

tomaarsen commented 1 year ago

Hello @oostopitre,

I do not believe that multi-task learning is currently supported. However, I do believe that this may be possible, as SetFit relies on SentenceTransformers, for which multi-task learning is supported.

SentenceTransformer docs: https://www.sbert.net/docs/training/overview.html#multitask-training
A line showing how multi-task learning could be applied for a SentenceTransformer: https://github.com/UKPLab/sentence-transformers/blob/3e1929fddef16df94f8bc6e3b10598a98f46e62d/examples/training/other/training_multi-task.py#L105

In SetFit, we call the SentenceTransformer.fit method here: https://github.com/huggingface/setfit/blob/944e9e26ac1990db16e68a8a0ff48d8d2de8f9a3/src/setfit/trainer.py#L390-L397 And as you can see, we only provide one training objective (i.e. one dataloader and loss tuple).

If you are very interested in this topic, you may be able to modify this for your use case through e.g. a fork.
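
To make the "one dataloader and loss tuple" point more concrete, here is a minimal pure-Python sketch of how a list of training objectives can be interleaved, which is roughly how I understand multi-task training with SentenceTransformer.fit to behave (one batch per objective per step, restarting a dataloader when it runs out). This is not the SetFit or SentenceTransformers code: round_robin_steps, run_epoch, the toy batches and the toy loss functions are all hypothetical names made up for illustration, and plain lists stand in for DataLoaders.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple

Batch = Sequence[str]
# Hypothetical stand-in for a (DataLoader, loss) tuple: a list of batches plus a loss function.
Objective = Tuple[Sequence[Batch], Callable[[Batch], float]]


def round_robin_steps(
    objectives: Sequence[Objective], steps_per_epoch: Optional[int] = None
) -> Iterator[Tuple[int, Batch]]:
    """Yield (objective_index, batch) pairs, one batch per objective per step."""
    if steps_per_epoch is None:
        # Assumption: default to the length of the shortest list of batches.
        steps_per_epoch = min(len(batches) for batches, _ in objectives)
    iterators = [iter(batches) for batches, _ in objectives]
    for _ in range(steps_per_epoch):
        for idx, (batches, _) in enumerate(objectives):
            try:
                batch = next(iterators[idx])
            except StopIteration:
                # This objective ran out of batches: start over from its first batch.
                iterators[idx] = iter(batches)
                batch = next(iterators[idx])
            yield idx, batch


def run_epoch(
    objectives: Sequence[Objective], steps_per_epoch: Optional[int] = None
) -> List[Tuple[int, float]]:
    """Apply each objective's own loss to its batch and log (objective_index, loss)."""
    log = []
    for idx, batch in round_robin_steps(objectives, steps_per_epoch):
        loss_fn = objectives[idx][1]
        log.append((idx, loss_fn(batch)))
    return log


# Toy data: two "tasks" with different batches and different (fake) losses.
topic_batches = [["a", "b"], ["c", "d"], ["e", "f"]]
sentiment_batches = [["x"], ["y"]]


def toy_topic_loss(batch: Batch) -> float:
    return float(len(batch))


def toy_sentiment_loss(batch: Batch) -> float:
    return 0.5 * len(batch)


# Single objective: analogous to passing exactly one (dataloader, loss) tuple.
single_task_log = run_epoch([(topic_batches, toy_topic_loss)])

# Two objectives: the batches alternate between the tasks at every step.
multi_task_log = run_epoch(
    [(topic_batches, toy_topic_loss), (sentiment_batches, toy_sentiment_loss)]
)
```

In a fork, the equivalent change would presumably be to build one dataloader and loss per task and pass all of the tuples as the training objectives instead of a single one. I have not tested that against SetFit itself, so please treat it as an idea rather than a recipe.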

huggingface / setfit

Limitations of Setfit Model #209