huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Apache License 2.0
Integrating ScatterMoE #104 (Open)
shawntan opened 6 months ago
shawntan commented 6 months ago
Added an option to use ScatterMoE.
The ScatterMoE option asserts that EP = 1, because ScatterMoE doesn't support expert parallelism.
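
For illustration, a guard of this kind might look roughly like the sketch below. This is not the code from this PR: `MoEConfig`, `use_scattermoe`, `check_moe_parallelism`, and `expert_parallel_size` are hypothetical names chosen for the example, and the real nanotron config and parallel-context objects may be shaped differently.

```python
from dataclasses import dataclass


@dataclass
class MoEConfig:
    # Hypothetical config for illustration; field names are assumptions,
    # not nanotron's actual configuration schema.
    num_experts: int = 8
    use_scattermoe: bool = False


def check_moe_parallelism(config: MoEConfig, expert_parallel_size: int) -> None:
    """Fail early if ScatterMoE is requested together with expert parallelism."""
    if config.use_scattermoe:
        # ScatterMoE keeps all experts on each rank, so EP must be 1.
        assert expert_parallel_size == 1, (
            "ScatterMoE does not support expert parallelism; "
            f"got expert_parallel_size={expert_parallel_size}, expected 1"
        )


# ScatterMoE with EP = 1 is accepted.
check_moe_parallelism(MoEConfig(use_scattermoe=True), expert_parallel_size=1)
# Without ScatterMoE, EP > 1 is not restricted by this check.
check_moe_parallelism(MoEConfig(use_scattermoe=False), expert_parallel_size=4)
```

Running the check at configuration time, before any model is built, means an unsupported combination fails with a readable message instead of surfacing later as a shape or communication error.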