Open miaomiaoma0703 opened 2 months ago
How can I measure all-to-all communication time during inference of MoE models such as Qwen1.5-MoE-A2.7B with DeepSpeed?
When running inference on MoE models such as Qwen1.5-MoE-A2.7B with DeepSpeed, how can I measure the time spent in all-to-all communication in the MoE layers?
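For illustration, one possible starting point is to wrap the collective call in a timer and accumulate the elapsed time per call. The sketch below is not a confirmed DeepSpeed recipe. `CommTimer` and its methods are hypothetical names introduced here, and the commented usage assumes that the run uses expert parallelism (otherwise no all-to-all is issued) and that the collective is reached through `torch.distributed.all_to_all_single` looked up at call time. A `sync` callback such as `torch.cuda.synchronize` is needed because GPU collectives are launched asynchronously. Calls made with `async_op=True` would be forced to complete by that synchronization, which changes overlap behavior.

```python
import time
from collections import defaultdict
from functools import wraps


class CommTimer:
    """Hypothetical helper: accumulates wall-clock time spent in wrapped calls."""

    def __init__(self, sync=None):
        # `sync` is called before and after each wrapped call so that queued
        # asynchronous work is included in the measurement.
        self.sync = sync or (lambda: None)
        self.total = defaultdict(float)
        self.calls = defaultdict(int)

    def wrap(self, fn, name=None):
        label = name or fn.__name__

        @wraps(fn)
        def timed(*args, **kwargs):
            self.sync()  # drain earlier work so it is not billed to this call
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                self.sync()  # wait for the wrapped call to actually finish
                self.total[label] += time.perf_counter() - start
                self.calls[label] += 1

        return timed

    def summary(self):
        # Per-label call count and accumulated seconds.
        return {
            k: {"calls": self.calls[k], "total_s": self.total[k]}
            for k in self.total
        }


# Assumed usage (requires torch and an initialized process group):
#   import torch
#   import torch.distributed as dist
#   timer = CommTimer(sync=torch.cuda.synchronize)
#   dist.all_to_all_single = timer.wrap(dist.all_to_all_single, "all_to_all_single")
#   ... run inference ...
#   print(timer.summary())
```

The extra synchronization adds overhead and serializes communication with compute, so the numbers are an upper-bound estimate rather than the cost seen in an unmodified run. A profiler trace may give a more faithful picture.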