There is a new distributed system paper, "FasterMoE: Modeling and Optimizing Training of Large-Scale Dynamic Pre-Trained Models", published at PPoPP'22. Please kindly consider including this paper in your list.
FYI, we have also included your collection of MoE systems and papers on FastMoE's homepage.