huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Apache License 2.0
1.23k stars, 122 forks
[Fix] Assert the wrong tolerance of FA2's Layer Norm kernel #81
Closed
xrsrke, closed 8 months ago