Closed shaily99 closed 3 years ago
This was a while ago, but I think these are the hyperparameters I specified (the rest were all Hugging Face defaults):
--model_type roberta --model_name_or_path roberta-base --max_seq_length 128 --learning_rate 2e-5 --num_train_epochs 3.0
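For reference, flags like these would typically be passed to the transformers GLUE fine-tuning script. The snippet below is just a sketch of how that invocation might look; the script version, task name, and data/output paths are assumptions, not something stated above:

```shell
# Sketch only: older-style transformers run_glue.py invocation.
# --model_type suggests a pre-v3 script; paths and task name are placeholders.
python run_glue.py \
  --model_type roberta \
  --model_name_or_path roberta-base \
  --task_name SST-2 \
  --do_train --do_eval \
  --data_dir /path/to/SST-2 \
  --max_seq_length 128 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --output_dir /tmp/sst2-roberta
```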
In any case, I wouldn't worry too much about it. These models are not well calibrated, so it's normal for most predictions to be super confident. Getting neutral predictions by thresholding a probability range is a hack, since these models are trained on binary rather than three-way classification. We did it so we could compare research models to commercial models, but it's not what I would do if I actually wanted a sentiment model that predicted 'neutral'.
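The "neutral as a probability band" hack described above can be sketched like this. The function name and default thresholds are illustrative (the 0.33/0.66 band comes from the question below), not from the original code:

```python
# Sketch of mapping a binary sentiment model's positive-class probability
# to a three-way label. Thresholds are the 0.33/0.66 band mentioned here;
# a poorly calibrated model will rarely land inside the neutral band.
def to_three_way(p_positive, low=0.33, high=0.66):
    """Map P(positive) from a binary classifier to pos/neg/neutral."""
    if p_positive < low:
        return "negative"
    if p_positive > high:
        return "positive"
    return "neutral"

print(to_three_way(0.95))  # -> positive
print(to_three_way(0.50))  # -> neutral
print(to_three_way(0.05))  # -> negative
```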
I am trying to use an XLM-R model fine-tuned on SST-2 for CheckList testing. The parameters I am currently using never seem to give neutral predictions (positive probability in the range 0.33-0.66). Can you please share the exact parameters (learning rate, epochs, weight decay, dropout, etc.) that you used to fine-tune RoBERTa for the CheckList experiments?