matlab-deep-learning / transformer-models

Deep Learning Transformer models in MATLAB

GPT-2 doesn't include dropout layers #18

Open bwdGitHub opened 2 years ago

bwdGitHub commented 2 years ago

We would like to use these issues to gauge user interest.

The GPT-2 implementation does not include dropout layers. Adding them would be useful for further pre-training and fine-tuning workflows, where dropout helps prevent overfitting.
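
For anyone weighing in, here is a rough sketch of what a dropout step could look like, written as inverted dropout on a plain numeric array. It is only an illustration under stated assumptions: the variable names, the drop probability `p = 0.1`, and the `isTraining` flag are all hypothetical and do not come from this repository's code.

```matlab
% Minimal sketch of inverted dropout (illustrative only; the names and
% values below are assumptions, not part of this repository).
p = 0.1;              % drop probability (assumed value)
isTraining = true;    % dropout should only be active during training

X = ones(4, 8);       % stand-in for the activations of one block

if isTraining
    mask = rand(size(X)) >= p;    % keep each element with probability 1-p
    Y = X .* mask ./ (1 - p);     % rescale so the expected value matches X
else
    Y = X;                        % identity at inference time
end
```

Rescaling by `1/(1-p)` during training means nothing needs to change at inference time, so the existing generation path could stay as it is while fine-tuning code opts in through a flag like `isTraining`.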