HPDL-Group / Merak

Merak is better? #10

Closed by Lvjinhong 9 months ago

Lvjinhong commented 9 months ago

Thank you for your work. I have a question: I plan to pre-train a language model with about 5 billion parameters. Would you recommend using Merak for this? Is Merak mature enough to be a better choice than Megatron-LM, which I had originally planned to use?

lucasleesw commented 9 months ago

Thanks for your interest. We think Merak is better than Megatron in terms of parallel training runtime. Admittedly, though, Merak lacks some of the techniques available in Megatron, such as operator (OP) fusion, which can bring performance benefits.