microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

fix init issue for silently ignoring the deepspeed config #452

Closed xylian86 closed 1 month ago

xylian86 commented 1 month ago

This PR fixes an issue with deepspeed-activation-checkpointing. Previously, when users enabled this feature, the corresponding config options, such as partition_activations and cpu_checkpointing, were silently ignored because they were not passed to the function.
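
To illustrate the failure mode, here is a minimal, self-contained sketch of the pattern. It does not reproduce the actual Megatron-DeepSpeed or DeepSpeed code: `configure_checkpointing`, `setup_before_fix`, `setup_after_fix`, and `CheckpointSettings` are hypothetical names, and the `activation_checkpointing` section layout is assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckpointSettings:
    """Hypothetical container for the effective checkpointing options."""
    partition_activations: bool = False
    cpu_checkpointing: bool = False


def configure_checkpointing(partition_activations=None, cpu_checkpointing=None):
    """Hypothetical stand-in for an activation checkpointing setup function.

    Any argument left as None falls back to False without warning.
    """
    return CheckpointSettings(
        partition_activations=bool(partition_activations),
        cpu_checkpointing=bool(cpu_checkpointing),
    )


def setup_before_fix(user_config):
    # Bug pattern: the user's options are never forwarded,
    # so the defaults win and nothing signals the mismatch.
    return configure_checkpointing()


def setup_after_fix(user_config):
    # Fix pattern: read the options and pass them through explicitly.
    section = user_config.get("activation_checkpointing", {})
    return configure_checkpointing(
        partition_activations=section.get("partition_activations"),
        cpu_checkpointing=section.get("cpu_checkpointing"),
    )


# Assumed config layout, for illustration only.
user_config = {
    "activation_checkpointing": {
        "partition_activations": True,
        "cpu_checkpointing": True,
    }
}

before = setup_before_fix(user_config)
after = setup_after_fix(user_config)
print("before fix:", before)
print("after fix: ", after)
```

In the "before" path the user's settings are dropped without any error, which is what makes the bug easy to miss; in the "after" path the effective settings match the config.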