BorealisAI / DT-Fixup

Optimizing Deeper Transformers on Small Datasets https://arxiv.org/abs/2012.15355