huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Apache License 2.0
1.14k stars
107 forks
[Bug] Fix missing `.get_named_params_without_weight_decay()` in llama
#148
Closed
xrsrke closed 5 months ago
3outeille commented 5 months ago
lgtm