microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

[Question] Why use datatype ncclInt8 in nccl_all_to_all_scatter_async. #220

Open cicirori opened 8 months ago

cicirori commented 8 months ago

I'm wondering why the ncclInt8 datatype is used in the C++ implementation of nccl_all_to_all_scatter_async. Is it for speed reasons, or simply to avoid supporting multiple datatypes through templates?

Thanks!

ghostplant commented 2 months ago

According to bandwidth profiling, there is no speed difference between sending N elements as ncclInt8 and sending N / 4 elements as ncclInt32, so you can choose either.
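
To make the equivalence concrete, here is a minimal sketch of the element-count arithmetic behind that answer. It assumes the all-to-all is pure data movement with no reduction, so the NCCL datatype only sets how many bytes each element counts for. `TransferSpec`, `as_int8`, and `as_int32` are hypothetical helpers written for this illustration; they are not part of Tutel or NCCL, and the block makes no NCCL calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one transfer: the element count that would be
// passed to a send/recv call, plus the byte width of the chosen datatype.
struct TransferSpec {
    std::size_t count;      // number of elements handed to the call
    std::size_t elem_size;  // bytes per element (1 for ncclInt8, 4 for ncclInt32)

    // Total payload size in bytes.
    std::size_t bytes() const { return count * elem_size; }
};

// View a buffer of `numel` elements, each `dtype_size` bytes wide, as raw
// bytes. This works for any element type, so no templates are needed.
inline TransferSpec as_int8(std::size_t numel, std::size_t dtype_size) {
    return TransferSpec{numel * dtype_size, 1};
}

// View the same buffer as 4-byte words. This is only valid when the total
// byte count is a multiple of 4.
inline TransferSpec as_int32(std::size_t numel, std::size_t dtype_size) {
    const std::size_t total_bytes = numel * dtype_size;
    assert(total_bytes % 4 == 0 && "payload must be a multiple of 4 bytes");
    return TransferSpec{total_bytes / 4, 4};
}
```

For example, a buffer of 1024 half-precision values (2 bytes each) gives `as_int8(1024, 2)` a count of 2048 and `as_int32(1024, 2)` a count of 512, and both describe the same 2048-byte payload. The byte view also covers payloads whose size is not a multiple of 4, which the 4-byte view cannot express.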