huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Adding multiple heads #427

Open anar2706 opened 1 year ago

anar2706 commented 1 year ago

Hi @tomaarsen, I want to train bge-base-en-1.5 on multiple datasets from MTEB. But when I train it on one dataset, save it, and then fine-tune the saved model on another dataset, the accuracy decreases a lot. Can you please give an example or explain how to do this correctly?

tomaarsen commented 11 months ago

Is your goal to produce a text classification model or an embedding model? If the latter, then you might not want to use SetFit. However, if you want to do text classification using the text classification datasets from MTEB, then you could try combining them into one larger dataset (e.g. with `concatenate_datasets`) and training on that. Do note, however, that (too) large datasets don't strictly improve model performance.
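For illustration, here is a minimal sketch of that approach, assuming the setfit>=1.0 `Trainer` API. The two dataset names (`SetFit/sst2` and `imdb`) are assumptions chosen because both are binary sentiment datasets with the same label meaning; combining datasets whose labels mean different things would not make sense for a single classifier.

```python
# Minimal sketch: combine several classification datasets and train once.
# Dataset names below are illustrative assumptions, not a verified recipe.
from datasets import load_dataset, concatenate_datasets
from setfit import SetFitModel, Trainer, TrainingArguments

dataset_a = load_dataset("SetFit/sst2", split="train")
dataset_b = load_dataset("imdb", split="train")

# Keep only the shared columns and align the label feature types so that
# concatenate_datasets accepts both datasets.
cols = ["text", "label"]
dataset_a = dataset_a.remove_columns([c for c in dataset_a.column_names if c not in cols])
dataset_b = dataset_b.remove_columns([c for c in dataset_b.column_names if c not in cols])
dataset_b = dataset_b.cast(dataset_a.features)

# One combined training dataset instead of several sequential fine-tunes.
# SetFit is few-shot, so a small sample is usually enough (and avoids the
# diminishing returns of very large datasets mentioned above).
train_dataset = (
    concatenate_datasets([dataset_a, dataset_b])
    .shuffle(seed=42)
    .select(range(64))
)

model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")
args = TrainingArguments(batch_size=16, num_epochs=1)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# A single train() call on the combined data.
trainer.train()
```

The small sample size here reflects SetFit's few-shot design; you can scale it up, but per the caveat above, more data does not strictly mean better performance.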

So, in short, calling train once with a combined training dataset will work better than calling it X times with X different training datasets.