huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.16k stars 217 forks

Dealing with imbalanced classes #407

Open Matthieu-Tinycoaching opened 1 year ago

Matthieu-Tinycoaching commented 1 year ago

Hi,

I have 9 imbalanced classes. The smallest one has 42 examples and the largest one has 95.

I cannot downsample the larger classes because I need all of this training data. Is there any way to add class weights during SetFit training?

kgourgou commented 1 year ago

Hello!

You may just need to switch to a SetFit head with a balanced loss; see for example: https://github.com/huggingface/setfit/issues/375

Specifically:

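from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from setfit import SetFitModel
model_body = SentenceTransformer("your-model-name")
model_head = LogisticRegression(class_weight="balanced")  # weight classes inversely to their frequency
model = SetFitModel(model_body=model_body, model_head=model_head)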
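
To see what class_weight="balanced" does to the loss, here is a minimal sketch of the weighting rule scikit-learn documents for that setting: n_samples / (n_classes * class_count). The helper name balanced_class_weights and the two-class label list are hypothetical. Only the sizes 42 and 95 come from the question, since the other seven class sizes are not given.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Hypothetical helper that mirrors the rule scikit-learn documents for
    class_weight="balanced"; it is not part of SetFit or scikit-learn.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Assumed toy data: two classes with the smallest and largest sizes from the question.
labels = ["small"] * 42 + ["large"] * 95
weights = balanced_class_weights(labels)

for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

Under this rule the rarer class gets the larger weight, and the ratio between two weights equals the inverse ratio of the class sizes (95/42 here, roughly 2.26). All examples stay in the training set; only their contribution to the head's loss changes.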