laekov / fastmoe

A fast MoE impl for PyTorch
https://fastmoe.ai
Apache License 2.0
1.57k stars, 189 forks

[BUG FIX] Fix bugs in stream manager. #172

Closed: zms1999 closed this 1 year ago

zms1999 commented 1 year ago

The stream manager is used by both computation kernels (e.g., parallel linear) and communication kernels (e.g., the NCCL send/recv in smart scheduling). To avoid unnecessary synchronization, developers should carefully check which stream each kernel is launched on and which stream each synchronization waits for.
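To illustrate why the choice of stream matters, here is a minimal toy model in plain Python. It is only a sketch: `ToyStreamManager`, its methods, and the stream names are hypothetical and do not reflect FastMoE's actual stream manager or any CUDA API. It assumes computation and communication kernels are placed on separate streams, and shows that synchronizing one stream does not have to wait for work queued on the other.

```python
from collections import defaultdict


class ToyStreamManager:
    """Toy model of a stream pool (hypothetical, not FastMoE's class)."""

    def __init__(self, num_compute_streams=2):
        self.compute_streams = [f"compute{i}" for i in range(num_compute_streams)]
        self.comm_stream = "comm"
        # Kernels launched on each stream that have not been synchronized yet.
        self.pending = defaultdict(list)

    def stream(self, idx, kind="compute"):
        """Return the name of the stream a kernel of the given kind should use."""
        if kind == "comm":
            return self.comm_stream
        return self.compute_streams[idx % len(self.compute_streams)]

    def launch(self, stream, kernel):
        """Record an asynchronous kernel launch on `stream`."""
        self.pending[stream].append(kernel)

    def sync(self, stream):
        """Wait only for `stream`; return the kernels that completed."""
        return self.pending.pop(stream, [])


manager = ToyStreamManager()
manager.launch(manager.stream(0), "parallel_linear")
manager.launch(manager.stream(0, kind="comm"), "nccl_send_recv")

# Synchronizing the compute stream leaves the communication stream untouched.
finished = manager.sync(manager.stream(0))
still_pending = dict(manager.pending)
```

In this model, waiting on the wrong stream would either block on unrelated work or miss the kernel that actually needs to finish, which is the kind of mistake that checking the stream in use is meant to prevent.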

zms1999 commented 1 year ago

Thanks @chenyu-jiang for pointing out this problem in issue #168.