Closed ArthurConmy closed 2 years ago
Yes, I believe they were trained with dropout.
You can see this in the configs for the checkpoints: https://huggingface.co/stanford-crfm
If there are config files with dropout 0.0, I think those are for testing: we want to check that we get the same model, so we set dropout to 0.0 for the test. The production models definitely have standard dropout.
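If you want to check a given config yourself, here is a minimal sketch. It assumes the config follows the GPT-2 style key names used by Hugging Face (`resid_pdrop`, `embd_pdrop`, `attn_pdrop`); the helper names and the two inline JSON strings are hypothetical, made up for illustration and not copied from any real checkpoint.

```python
import json

# Assumption: GPT-2 style dropout keys. Adjust if your config names them differently.
DROPOUT_KEYS = ("resid_pdrop", "embd_pdrop", "attn_pdrop")


def dropout_settings(config: dict) -> dict:
    """Return the dropout-related entries present in a config dict."""
    return {key: config[key] for key in DROPOUT_KEYS if key in config}


def uses_dropout(config: dict) -> bool:
    """True if any dropout probability in the config is greater than zero."""
    return any(p > 0 for p in dropout_settings(config).values())


# Hypothetical config contents, for illustration only.
training_like = json.loads(
    '{"n_layer": 12, "resid_pdrop": 0.1, "embd_pdrop": 0.1, "attn_pdrop": 0.1}'
)
test_like = json.loads(
    '{"n_layer": 12, "resid_pdrop": 0.0, "embd_pdrop": 0.0, "attn_pdrop": 0.0}'
)

print(dropout_settings(training_like), uses_dropout(training_like))
print(dropout_settings(test_like), uses_dropout(test_like))
```

To use it on a real checkpoint, load that checkpoint's `config.json` with `json.load` and pass the resulting dict to `uses_dropout`. A config with all zeros would be consistent with the test setup described above, not with the production models.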
Were the models trained with dropout?
Searching the repo, I found one config file where the dropout parameter is 0.0 and another where it is non-zero. I am confused and would love an answer. Thanks!