stickeritis / sticker

Succeeded by SyntaxDot: https://github.com/tensordot/syntaxdot

Proposal: set default warmup steps to 2000 #169

Closed by danieldk 4 years ago

danieldk commented 4 years ago

We use Adam in all the models. I propose setting the default number of warmup steps to 2000 for both train and pretrain, since in most cases this results in better models than training without warmup.
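
For illustration, here is a minimal sketch of what a 2000-step warmup could look like. It assumes a linear ramp from near zero up to the base learning rate; the issue does not say which warmup shape sticker uses, and `warmup_lr` and its parameters are hypothetical names, not sticker's API.

```rust
/// Hypothetical helper (not part of sticker): learning rate at a given
/// zero-based training step, assuming linear warmup.
///
/// During the first `warmup_steps` steps the rate grows linearly towards
/// `base_lr`; afterwards it stays at `base_lr`. With `warmup_steps == 0`
/// there is no warmup and `base_lr` is returned immediately.
fn warmup_lr(base_lr: f32, step: usize, warmup_steps: usize) -> f32 {
    if warmup_steps == 0 || step >= warmup_steps {
        // Warmup disabled or already finished: use the full rate.
        base_lr
    } else {
        // Fraction of the warmup completed, counting the current step.
        base_lr * (step as f32 + 1.0) / warmup_steps as f32
    }
}
```

With the proposed default, `warmup_lr(base_lr, step, 2000)` reaches half of `base_lr` at step 999 and the full `base_lr` from step 1999 onwards. Starting with small updates gives Adam's moment estimates time to stabilize before full-size steps are taken.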