huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Limitations of Setfit Model #209

Open nv78 opened 1 year ago

nv78 commented 1 year ago

Hi, I was wondering about your thoughts on some of the limitations of the SetFit model. Can it support any sort of few-shot text classification, and what are some areas where this model falls short? Are there any research papers or ideas that address some of these limitations?

Also, is the model available to call via Hugging Face's Inference API for enterprise? We saw the AG News endpoint, but are there any other endpoints that are more generalizable? If not, how would you recommend distilling a derivative of this model for production?

MosheWasserb commented 1 year ago

Some interesting research directions for SetFit:

oostopitre commented 1 year ago

Is there a way for SetFit to support multi-task learning? That is, for every data point I want to compute a loss that factors in multiple output labels simultaneously. The column mapping expects only two fields (text and label), which makes me think the assumption of one label per row is deeply baked in. I would love to know if there are any workarounds or options for multi-task learning.

tomaarsen commented 1 year ago

Hello @oostopitre,

I do not believe that multi-task learning is currently supported. However, I do believe that this may be possible, as SetFit relies on SentenceTransformers, for which multi-task learning is supported.

In SetFit, we call the SentenceTransformer.fit method here: https://github.com/huggingface/setfit/blob/944e9e26ac1990db16e68a8a0ff48d8d2de8f9a3/src/setfit/trainer.py#L390-L397

As you can see, we only provide one training objective (i.e. one dataloader and loss tuple).

If you are very interested in this topic, you may be able to modify this for your use case through e.g. a fork.
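To make "more than one training objective" a bit more concrete, here is a minimal, library-free sketch of one common way to combine several (dataloader, loss) tuples: take one batch from each objective in turn on every step. Everything in it is an assumption for illustration only: `round_robin_steps`, the toy loss functions, and the toy batches are hypothetical and are not part of SetFit or SentenceTransformers. In a real fork, the batches would come from actual dataloaders and the losses from actual loss modules.

```python
from itertools import cycle
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical stand-ins: a "batch" is just a list of numbers, and a
# "loss" is any function mapping a batch to a float.
Batch = Sequence[float]
Objective = Tuple[List[Batch], Callable[[Batch], float]]


def round_robin_steps(objectives: List[Objective]) -> Iterator[Tuple[int, float]]:
    """Yield (objective_index, loss_value), one batch per objective per step.

    An "epoch" here is as long as the shortest list of batches, so no
    objective sees more steps than any other.
    """
    steps_per_epoch = min(len(batches) for batches, _ in objectives)
    # cycle() lets a shorter objective restart if it runs out of batches.
    iterators = [cycle(batches) for batches, _ in objectives]
    for _ in range(steps_per_epoch):
        for idx, (_, loss_fn) in enumerate(objectives):
            batch = next(iterators[idx])
            yield idx, loss_fn(batch)


def topic_loss(batch: Batch) -> float:
    """Toy loss for a first task: the mean of the batch."""
    return sum(batch) / len(batch)


def sentiment_loss(batch: Batch) -> float:
    """Toy loss for a second task: the maximum of the batch."""
    return max(batch)


# Two objectives, each a (batches, loss) tuple, mirroring the
# "one dataloader and loss tuple" shape with two entries instead of one.
objectives: List[Objective] = [
    ([[1.0, 3.0], [2.0, 4.0], [5.0, 7.0]], topic_loss),
    ([[0.5, 1.5], [2.5, 0.5]], sentiment_loss),
]

schedule = list(round_robin_steps(objectives))
for idx, loss in schedule:
    print(f"objective {idx}: loss={loss}")
```

With this scheduling each task keeps its own data and its own loss, so a row does not need to carry several labels at once; whether that is enough for your use case depends on how tightly the labels have to be coupled.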