EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0
Fix weight decay module check #1274
Closed

aurelion-source closed this 2 months ago

aurelion-source commented 2 months ago
Updates is_no_weight_decay_module to check the module name (see the sketch below).
Adds checks for MixedFused (Apex) layers.
Removes redundant TE imports.
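For illustration, here is a minimal sketch of what a name-based exclusion check like this could look like; the class-name set, docstring, and exact signature are assumptions for this example, not the repository's actual implementation:

```python
import torch.nn as nn

# Norm-style modules whose parameters typically skip weight decay.
# "MixedFusedLayerNorm" / "MixedFusedRMSNorm" are the Apex fused variants the
# PR description mentions; the other names are standard PyTorch norms.
# This set is illustrative only.
_NO_WEIGHT_DECAY_MODULE_NAMES = {
    "LayerNorm",
    "RMSNorm",
    "MixedFusedLayerNorm",
    "MixedFusedRMSNorm",
}


def is_no_weight_decay_module(module: nn.Module) -> bool:
    """Return True if the module's parameters should be excluded from weight decay.

    Comparing type(module).__name__ against a set of known names avoids
    importing optional backends (Apex, Transformer Engine) just to run
    isinstance() checks against their classes.
    """
    return type(module).__name__ in _NO_WEIGHT_DECAY_MODULE_NAMES
```

One consequence of checking the name rather than the type is that the optional Apex and Transformer Engine packages no longer need to be imported solely for this check, which is consistent with the TE-import cleanup mentioned above.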