PhilipMay opened this issue 2 years ago
Hi @PhilipMay, Sorry for the late reply. That's an interesting idea!
I tried adding a dropout layer, and unfortunately the performance dropped significantly with a 0.05 dropout rate (I only tried the emotion dataset, where the score dropped from 48.XX to 2X.XX ~ 3X.XX). I also tried adding a layer norm, but the results were the same.
So I think maybe these two are not suitable for few-shot learning. By the way, here is the command I used for testing:
python scripts/setfit/run_fewshot.py --classifier pytorch --keep_body_frozen --lr 0.01 --is_test_set true
I'm personally in favor of refactoring to an implementation where it's relatively simple for users to provide their own heads, after which these kinds of head features (dropout, pooling, multiple layers, etc.) can all be implemented to the likings of the user.
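To illustrate the refactoring idea, a user-supplied head could look something like the sketch below. This is plain PyTorch, not SetFit's actual API; the class name, constructor arguments, and dimensions are all hypothetical, and it simply shows how dropout, an extra layer, and a nonlinearity could be combined to the user's liking.

```python
import torch
from torch import nn


class CustomHead(nn.Module):
    """Hypothetical user-provided classification head (illustrative only)."""

    def __init__(self, embed_dim: int, num_classes: int, dropout: float = 0.05):
        super().__init__()
        self.net = nn.Sequential(
            nn.Dropout(dropout),              # the regularization under discussion
            nn.Linear(embed_dim, embed_dim),  # optional extra hidden layer
            nn.Tanh(),
            nn.Linear(embed_dim, num_classes),
        )

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, embed_dim) sentence embeddings from the frozen body
        return self.net(embeddings)


head = CustomHead(embed_dim=768, num_classes=6)
logits = head(torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 6])
```

With such an interface, experiments like the dropout run above become a matter of swapping in a different head rather than changing library code.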
Hi, maybe a dropout layer in the Dense SetFitHead would be nice to have. The implementation could be like the Transformers BertForSequenceClassification head: https://github.com/huggingface/transformers/blob/d447c460b16626c656e4d7a9425f648fe69517b3/src/transformers/models/bert/modeling_bert.py#L1506-L1517
What do you think? @blakechi @lewtun
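For reference, the linked BertForSequenceClassification code follows a simple pattern: dropout applied to the pooled output, then a linear classifier. A rough sketch of that pattern in plain PyTorch (names and dimensions here are illustrative, not SetFit's actual SetFitHead):

```python
import torch
from torch import nn


class DropoutDenseHead(nn.Module):
    """Sketch of the dropout-then-linear pattern from BertForSequenceClassification."""

    def __init__(self, hidden_size: int, num_labels: int, dropout_prob: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout_prob)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pooled_output: torch.Tensor) -> torch.Tensor:
        # Dropout regularizes the pooled embedding before classification;
        # it is active only in training mode (module.train()).
        pooled_output = self.dropout(pooled_output)
        return self.classifier(pooled_output)
```

Note that `nn.Dropout` is a no-op in eval mode, so inference behavior is unchanged; the question raised above is whether the extra noise helps or hurts when training on only a handful of examples.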