training-transformers-together / training-transformers-together.github.io

Contents of the main NeurIPS 2021 demo page
MIT License

Internal experiments #12

Open justheuristic opened 2 years ago

justheuristic commented 2 years ago

The purpose of these experiments is to ensure that training is stable. Train each candidate setup through the warmup phase and see which one works best by that point.
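
One possible way to make "stable through warmup" concrete is sketched below. This is an illustration only, not the setup used in these experiments: the function names (`warmup_lr`, `is_stable`, `best_after_warmup`), the linear warmup shape, the spike threshold, and the sample loss values are all hypothetical, and the stability check assumes losses are positive.

```python
import math

def warmup_lr(step, warmup_steps, peak_lr):
    """Linear warmup (assumed shape): ramp up to peak_lr, then hold it."""
    return peak_lr * min(1.0, (step + 1) / warmup_steps)

def is_stable(losses, spike_factor=2.0):
    """Hypothetical criterion: a run is unstable if any loss is non-finite
    or exceeds spike_factor times the lowest loss seen before it."""
    best = math.inf
    for loss in losses:
        if not math.isfinite(loss):
            return False
        if loss > spike_factor * best:
            return False
        best = min(best, loss)
    return True

def best_after_warmup(runs, warmup_steps, tail=10):
    """Among runs that stayed stable for the whole warmup, return the name
    with the lowest mean loss over the last `tail` warmup steps."""
    scores = {}
    for name, losses in runs.items():
        window = losses[:warmup_steps]
        # Skip runs that ended early or were unstable during warmup.
        if len(window) == warmup_steps and is_stable(window):
            recent = window[-tail:]
            scores[name] = sum(recent) / len(recent)
    return min(scores, key=scores.get) if scores else None

# Made-up loss curves, one per candidate setup.
runs = {
    "a": [4.0, 3.0, 2.5, 2.0],
    "b": [4.0, 3.5, 9.0, 2.0],           # loss spike during warmup
    "c": [4.0, 3.2, 2.8, float("nan")],  # diverged
}
winner = best_after_warmup(runs, warmup_steps=4, tail=2)
```

With these made-up curves, only "a" passes the stability check, so it is the one selected.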

If worst comes to worst, fall back to CollaborativeOptimizer.