microsoft / Tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

[Question] Why use datatype ncclInt8 in nccl_all_to_all_scatter_async? #220

Open cicirori opened 10 months ago

cicirori commented 10 months ago

I'm wondering why the ncclInt8 datatype is used in the C++ implementation of nccl_all_to_all_scatter_async. Is it for speed reasons, or simply because you don't want to support multiple datatypes through templates?

Thanks!

ghostplant commented 4 months ago

According to bandwidth profiling, there is no speed difference between sending N elements as ncclInt8 and sending N / 4 elements as ncclInt32 (the same number of bytes either way), so you can choose either.
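
To illustrate the equivalence, here is a minimal sketch of how one payload maps to either element count. It is not Tutel's code: `count_as_int8`, `count_as_int32`, `copy_as_bytes`, and `round_trip` are hypothetical helpers, and `std::memcpy` stands in for the actual NCCL transfer so the example runs without a GPU. It relies on the fact that an all-to-all only moves data and does no arithmetic on it, so the datatype only determines the element size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: element count to pass when the payload of `n` values
// of type T is described as 1-byte elements (e.g. ncclInt8).
template <typename T>
constexpr std::size_t count_as_int8(std::size_t n) {
    return n * sizeof(T);
}

// Hypothetical helper: the same payload described as 4-byte elements
// (e.g. ncclInt32). Only valid when the byte size is a multiple of 4.
template <typename T>
std::size_t count_as_int32(std::size_t n) {
    const std::size_t bytes = n * sizeof(T);
    assert(bytes % 4 == 0 && "payload must be a multiple of 4 bytes");
    return bytes / 4;
}

// Stand-in for the transfer itself (an assumption for this sketch):
// copies `count` one-byte elements without interpreting them.
inline void copy_as_bytes(const void* src, void* dst, std::size_t count) {
    std::memcpy(dst, src, count);
}

// Moves a typed buffer through the byte-typed path and returns the copy.
template <typename T>
std::vector<T> round_trip(const std::vector<T>& src) {
    std::vector<T> dst(src.size());
    copy_as_bytes(src.data(), dst.data(), count_as_int8<T>(src.size()));
    return dst;
}
```

For a buffer of 4 floats, the two counts are 16 (as 1-byte elements) and 4 (as 4-byte elements), and the byte-wise copy reproduces the floats exactly. The byte count also works for payloads whose size is not a multiple of 4, such as an odd number of 16-bit values, where the 4-byte description does not apply. That is a possible practical advantage of the byte-typed form, though the thread does not state it as the reason.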