huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

num_epochs range #86

Open fhamborg opened 2 years ago

fhamborg commented 2 years ago

Hi there! I was wondering whether you could provide a range of typically "good" values to try for the num_epochs argument, in both the single-label and the multi-label classification case. Of course, the best value depends on the dataset and the classes to be predicted, but in non-FSL settings one typically uses between 2 and 5 epochs (and many researchers simply stick to a common default such as 3). I'm asking because I noticed that your example scripts use num_epochs = 20, so perhaps num_epochs should generally be higher in SetFit than in non-FSL settings?

lewtun commented 2 years ago

Hi @fhamborg, thanks for your interest in our work!

There are two main hyperparameters that we found important for SetFit: num_iterations and num_epochs.

In our experiments, we found num_iterations=20 and num_epochs=1 worked quite well in general. When num_iterations was too low (e.g. 5), we saw the average performance drop by ~2-3 percentage points.
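To make the difference between the two arguments concrete, here is a rough back-of-the-envelope sketch. It is not taken from the SetFit code base: `contrastive_budget` is a hypothetical helper, and it assumes that each iteration emits one positive and one negative pair per training sentence and that the batch size is 16. Neither assumption is stated in this thread.

```python
import math


def contrastive_budget(num_samples, num_iterations, num_epochs,
                       batch_size=16, pairs_per_sample=2):
    """Estimate how many sentence pairs and optimizer steps a run uses.

    Assumption (not confirmed in this thread): every iteration creates
    `pairs_per_sample` pairs for each training sentence.
    """
    # num_iterations scales the amount of pair data that is generated.
    num_pairs = num_samples * num_iterations * pairs_per_sample
    # num_epochs scales how often that pair data is passed over.
    steps = math.ceil(num_pairs / batch_size) * num_epochs
    return num_pairs, steps


# 8 labelled examples per class, 2 classes
print(contrastive_budget(16, num_iterations=20, num_epochs=1))  # (640, 40)
print(contrastive_budget(16, num_iterations=5, num_epochs=1))   # (160, 10)
```

Under these assumptions, num_iterations=20 with num_epochs=1 already yields 40 update steps from only 16 labelled sentences, which may explain why a single epoch can be enough.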

PS apologies if there was some confusion about num_epochs vs num_iterations: there were some typos in the blog post that should hopefully now be corrected!

fhamborg commented 2 years ago

Thanks a lot for the clarification! Looking at the code, isn't invoking the sentence pair generation methods num_iterations times the same as invoking them just once and then repeating the resulting list of pairs num_iterations times?
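Whether the two approaches are equivalent hinges on whether the generator picks pair partners at random. The sketch below is a toy illustration, not SetFit's implementation: `generate_pairs` is a hypothetical stand-in that, by assumption, draws one random same-label partner and one random different-label partner per sentence. If the real generator is deterministic, the two approaches would indeed coincide.

```python
import random


def generate_pairs(sentences, labels, rng):
    """One pass over the data (hypothetical generator, assumed random)."""
    pairs = []
    for i, (sent, lab) in enumerate(zip(sentences, labels)):
        # Same-label candidates, excluding the sentence itself.
        positives = [s for j, (s, l) in enumerate(zip(sentences, labels))
                     if l == lab and j != i]
        # Different-label candidates.
        negatives = [s for s, l in zip(sentences, labels) if l != lab]
        pairs.append((sent, rng.choice(positives), 1.0))
        pairs.append((sent, rng.choice(negatives), 0.0))
    return pairs


sentences = [f"pos {i}" for i in range(8)] + [f"neg {i}" for i in range(8)]
labels = [1] * 8 + [0] * 8
num_iterations = 20

# Option A: call the generator num_iterations times.
rng = random.Random(0)
resampled = []
for _ in range(num_iterations):
    resampled.extend(generate_pairs(sentences, labels, rng))

# Option B: call it once and repeat the resulting list.
repeated = generate_pairs(sentences, labels, random.Random(0)) * num_iterations

print(len(resampled), len(repeated))            # same total length
print(len(set(resampled)), len(set(repeated)))  # number of distinct pairs
```

Both lists have the same length, but with random sampling option A covers more distinct pairs than option B, which only ever repeats the pairs from its single pass.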