huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Misalignment between Sentence Embeddings and Classifier in multi-label classification? #500

Open ycouble opened 8 months ago

ycouble commented 8 months ago

Hello,

(Cross posting this between SetFit and sentence-transformers)

We're investigating the possibility of using SetFit for customer service message classification.

Our case is multi-label, since customers often have more than one request per message. During SetFit's training phase, the texts and labels are passed to Sentence Transformers' SentenceLabelDataset. Contrastive examples are then created based on the exact combination of labels, not on their intersection: e.g. examples labeled [1, 1, 0] and [1, 0, 0] will be pushed apart by contrastive learning, while only pairs both labeled [1, 1, 0] will be pulled together.
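If we understand the behaviour correctly, the pairing logic amounts to the following. This is a minimal sketch of the described behaviour, not SetFit's actual implementation; the texts and label vectors are made up:

```python
from itertools import combinations

def make_pairs_exact_match(examples):
    """A pair is positive only when the full multi-label vectors are
    identical, mirroring the exact-combination matching described above."""
    pairs = []
    for (t1, y1), (t2, y2) in combinations(examples, 2):
        label = 1 if tuple(y1) == tuple(y2) else 0
        pairs.append((t1, t2, label))
    return pairs

examples = [
    ("I want a refund and a new invoice", (1, 1, 0)),
    ("Refund my order and resend the invoice", (1, 1, 0)),
    ("I want a refund", (1, 0, 0)),
]
pairs = make_pairs_exact_match(examples)
# The two (1, 1, 0) texts form a positive pair, but (1, 1, 0) vs (1, 0, 0)
# becomes a negative pair even though both share the "refund" label.
```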

This can be counterproductive in SetFit: a one-vs-rest classifier, for example, would want examples sharing at least one label to be close to each other in the embedding space.
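One possible workaround, sketched here purely as an illustration (this is not existing SetFit behaviour), would be to count any shared active label as a positive pair, which is closer to what a one-vs-rest head needs:

```python
from itertools import combinations

def make_pairs_shared_label(examples):
    """Hypothetical alternative: a pair is positive when the label
    vectors intersect, i.e. the texts share at least one active label."""
    pairs = []
    for (t1, y1), (t2, y2) in combinations(examples, 2):
        shared = any(a and b for a, b in zip(y1, y2))
        pairs.append((t1, t2, 1 if shared else 0))
    return pairs

examples = [
    ("I want a refund and a new invoice", (1, 1, 0)),
    ("I want a refund", (1, 0, 0)),
    ("How do I change my address?", (0, 0, 1)),
]
pairs = make_pairs_shared_label(examples)
# (1, 1, 0) and (1, 0, 0) now form a positive pair via the shared first
# label, while the address question stays negative against both.
```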

We were wondering whether this behaviour was deliberately chosen, and if so, why? Do you have experience dealing with this type of data, or a known workaround? Would you be interested in a contribution to support this type of use case?

Cheers,