Pcard-76459
Support MoE expert parallelism in dygraph auto parallel. In auto-parallel expert parallelism, the experts' weights are placed on different process meshes. This PR implements expert parallelism as follows:
Main changes
Add two APIs, local_tensor_list_from_dtensor and dtensor_from_local_list, to transform tensors between the global mesh and local meshes.
Fix the problems that occur when the input tensors of an op have different meshes, a case that expert parallelism requires.
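The global-to-local transformation named in the first change can be pictured with a small, framework-free sketch. This is only a conceptual illustration: plain Python lists stand in for tensors, a flat list of ranks stands in for a process mesh, and the helper names split_to_local and merge_to_global are hypothetical. It does not show the signatures or behavior of local_tensor_list_from_dtensor or dtensor_from_local_list, which this description does not specify.

```python
# Conceptual sketch only: plain lists stand in for tensors and meshes.
# split_to_local / merge_to_global are hypothetical names, not Paddle APIs.

def split_to_local(global_rows, global_mesh, num_experts):
    """Split data on a global mesh into one (sub-mesh, rows) pair per expert."""
    if len(global_mesh) % num_experts or len(global_rows) % num_experts:
        raise ValueError("mesh and rows must divide evenly among experts")
    ranks_per_expert = len(global_mesh) // num_experts
    rows_per_expert = len(global_rows) // num_experts
    local = []
    for e in range(num_experts):
        # Each expert owns a contiguous slice of ranks and of rows.
        sub_mesh = global_mesh[e * ranks_per_expert:(e + 1) * ranks_per_expert]
        rows = global_rows[e * rows_per_expert:(e + 1) * rows_per_expert]
        local.append({"mesh": sub_mesh, "rows": rows})
    return local


def merge_to_global(local_list):
    """Reassemble per-expert pieces into one view on the combined mesh."""
    mesh, rows = [], []
    for item in local_list:
        mesh.extend(item["mesh"])
        rows.extend(item["rows"])
    return {"mesh": mesh, "rows": rows}


weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
parts = split_to_local(weights, global_mesh=[0, 1, 2, 3], num_experts=2)
restored = merge_to_global(parts)
```

In this toy setup each expert's piece carries its own sub-mesh, which is why an op that consumes pieces from two experts sees inputs on different meshes, the situation the second change addresses.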
PR Category
Auto Parallel
PR Types
New features
Description
Pcard-76459 Support MoE expert parallelism in dygraph auto parallel. In auto-parallel expert parallelism, the experts' weights are placed on different process meshes. This PR implements expert parallelism as follows:
Main changes
Add two APIs, local_tensor_list_from_dtensor and dtensor_from_local_list, to transform tensors between the global mesh and local meshes.