Open fhamborg opened 2 years ago
Hi @fhamborg, thanks for your interest in our work!
There are two main hyperparameters that we found important for SetFit:
num_iterations: The number of "positive" and "negative" text pairs to generate for contrastive learning. See #85 for more details on this.
num_epochs: The number of epochs to use for contrastive learning.
In our experiments, we found that num_iterations=20 and num_epochs=1 worked quite well in general. When num_iterations was too low (e.g. 5), we saw the average performance drop by ~2-3 percentage points.
PS: Apologies if there was some confusion about num_epochs vs num_iterations: there were some typos in the blog post that should hopefully now be corrected!
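To make the interplay of the two hyperparameters concrete, here is a rough back-of-the-envelope sketch of how they scale the contrastive training workload. It assumes that each iteration produces one positive and one negative pair per training example (an assumption, see #85 for the actual behaviour); the function name `contrastive_training_size` and the `batch_size` default are hypothetical and only used for illustration.

```python
import math


def contrastive_training_size(num_examples, num_iterations=20, num_epochs=1, batch_size=16):
    """Estimate (number of pairs, optimizer steps) for contrastive fine-tuning.

    Assumption (not confirmed in this thread): every iteration yields one
    positive and one negative pair per training example.
    """
    num_pairs = 2 * num_iterations * num_examples
    steps_per_epoch = math.ceil(num_pairs / batch_size)
    return num_pairs, steps_per_epoch * num_epochs


# 64 labelled examples with the settings mentioned above
pairs, steps = contrastive_training_size(num_examples=64)
print(pairs, steps)  # 2560 160
```

Under this assumption, dropping num_iterations from 20 to 5 cuts the number of pairs (and steps) by a factor of four, while raising num_epochs only repeats passes over the same pairs.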
Thanks a lot for the clarification! Looking at the code, isn't invoking the sentence pair generation methods num_iterations times the same as invoking them just once and then repeating the resulting list of pairs num_iterations times?
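One way to explore this question is with a toy model of pair generation. The sketch below is not SetFit's code: `generate_pairs` is a hypothetical, simplified generator that assumes each text is paired with one randomly sampled same-label text and one randomly sampled different-label text. If the real generation samples randomly in a similar way, repeated invocation and list repetition give the same number of pairs but not the same set of pairs.

```python
import random


def generate_pairs(texts, labels, rng):
    """One pass: pair each text with a random same-label and a random different-label text."""
    pairs = []
    for i, (text, label) in enumerate(zip(texts, labels)):
        positives = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
        negatives = [t for t, l in zip(texts, labels) if l != label]
        if positives:
            pairs.append((text, rng.choice(positives), 1.0))  # similar pair
        if negatives:
            pairs.append((text, rng.choice(negatives), 0.0))  # dissimilar pair
    return pairs


texts = ["great", "loved it", "superb", "nice", "awful", "hated it", "poor", "bad"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
num_iterations = 20

# Option A: call the generator num_iterations times (fresh random draws each time).
rng = random.Random(0)
resampled = [p for _ in range(num_iterations) for p in generate_pairs(texts, labels, rng)]

# Option B: call it once and repeat the resulting list.
repeated = generate_pairs(texts, labels, random.Random(0)) * num_iterations

print(len(resampled), len(repeated))            # same total number of pairs
print(len(set(resampled)), len(set(repeated)))  # distinct pairs seen during training
```

In this toy setting, Option B can never contain more than two distinct pairs per text, whereas Option A keeps drawing new partners, so the two are only equivalent if the generator is deterministic.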
Hi there! I was wondering whether you can provide a range of typically "good" values to use/test for the argument num_epochs, both in the single-label classification case and the multi-label classification case. Of course, the best-performing number depends on the classes to be predicted and the dataset, but in non-FSL settings one typically uses a value between 2 and 5 (and many researchers simply stick to a common default such as 3). I'm asking because I noticed that you instead use num_epochs = 20 in your example scripts, so perhaps in SetFit num_epochs should generally be higher than in non-FSL settings?