Closed suiyoubi closed 1 month ago
Thanks, approved. Just please make sure that the CI pipeline passes.
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
Hi @suiyoubi, do you think we could get this in? It just needs conflict resolution.
Hmm, I think @akoumpa extended https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/nlp/models/language_modeling/megatron/gpt_layer_modelopt_spec.py to cover MoE.
So we can't really replace the NeMo spec unless we also enable the MoE layer spec in https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/inference/gpt/model_specs.py.
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
What does this PR do ?
Unifies the code to use mcore's ModelOpt specs instead of the NeMo one.