huggingface / nanotron

Minimalistic large language model 3D-parallelism training
Apache License 2.0
1.14k stars, 107 forks

Help make brrr depend on nanotron #15

Closed thomwolf closed 8 months ago

thomwolf commented 8 months ago

This pull request cleans up brrr and makes it depend on nanotron.

Paired with https://github.com/huggingface/brrr/pull/596