huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Apache License 2.0
1.14k stars, 107 forks
Some sanity fix for "PR [Feature] Topology-agnostic optimizer states loading"
#29 (Closed)
xrsrke closed 8 months ago