epfLLM / Megatron-LLM

distributed trainer for LLMs

Any plans to rebase the codebase to most recent Megatron-LM for MoE? #100

Open xingyaoww opened 4 months ago

xingyaoww commented 4 months ago

It looks like the latest Megatron-LM already supports Mixture-of-Experts -- I'd be happy to see that supported here (for Mixtral)! I can also help contribute, but I don't have much experience rebasing Megatron-LLM onto upstream.
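
For context, here is a minimal sketch of what a Mixtral-style top-2 routed mixture-of-experts feed-forward block computes. This is illustrative only, not Megatron-LM's or Megatron-LLM's implementation; the class name, layer sizes, and expert count below are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k routed MoE feed-forward block (not the Megatron code)."""

    def __init__(self, hidden_size=1024, ffn_size=4096, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: [tokens, hidden_size]
        logits = self.router(x)                          # [tokens, num_experts]
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # normalize over the selected experts
        out = torch.zeros_like(x)
        for expert_id, expert in enumerate(self.experts):
            mask = indices == expert_id                  # [tokens, top_k]
            if mask.any():
                token_ids, slot = mask.nonzero(as_tuple=True)
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

tokens = torch.randn(16, 1024)
print(TopKMoE()(tokens).shape)  # torch.Size([16, 1024])
```

The hard part of a rebase is presumably not this layer itself but wiring expert parallelism into the existing tensor/pipeline parallel setup, which is what upstream Megatron-LM's MoE support provides.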