microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

Fix ROCm hipify issue for backward compatibility #116

Closed (abuccts closed 2 years ago)

abuccts commented 2 years ago

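With current ROCm 5.0 versions, hipify does not convert cuOccupancyMaxPotentialBlockSize correctly: the generated call to hipModuleOccupancyMaxPotentialBlockSize keeps an extra argument and fails to compile with the error below. For backward compatibility, force the change to hipModuleOccupancyMaxPotentialBlockSize and remove the extra argument manually.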

/root/tutel/tutel/custom/custom_kernel_hip.cpp: In function ‘void init_nccl(const at::Tensor&, int, int, int)’:
/root/tutel/tutel/custom/custom_kernel_hip.cpp:364:126: error: too many arguments to function ‘hipError_t hipModuleOccupancyMaxPotentialBlockSize(int*, int*, hipFunction_t, size_t, int)’
     CHECK_EQ(0, hipModuleOccupancyMaxPotentialBlockSize(&mem_stride_copy_gridsize, &mem_stride_copy_blocksize, hfunc, 0, 0, 0));
                                                                                                                              ^
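
The manual fix can be illustrated as a source rewrite. The following is a minimal sketch, not the code in this PR: the helpers `split_args` and `fix_occupancy_call` are hypothetical names, and the sketch assumes the argument to drop is the fourth one (the `blockSizeToDynamicSMemSize` callback of the CUDA driver API), since the HIP signature in the error message takes only `(int*, int*, hipFunction_t, size_t, int)`.

```python
import re

CUDA_NAME = "cuOccupancyMaxPotentialBlockSize"
HIP_NAME = "hipModuleOccupancyMaxPotentialBlockSize"

# Matches either spelling of the call, up to and including the opening parenthesis.
CALL_RE = re.compile(r"\b(?:%s|%s)\s*\(" % (CUDA_NAME, HIP_NAME))


def split_args(arg_text):
    """Split a C argument list on top-level commas only (hypothetical helper)."""
    args, current, depth = [], [], 0
    for ch in arg_text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


def fix_occupancy_call(source):
    """Rename the call to the HIP name and drop the 4th argument if six are given.

    Assumption: the 4th CUDA argument has no counterpart in the 5-parameter
    HIP function, so it is the one removed.
    """
    out, pos = [], 0
    while True:
        match = CALL_RE.search(source, pos)
        if match is None:
            out.append(source[pos:])
            return "".join(out)
        # Walk forward to the parenthesis that closes this call.
        depth, i = 1, match.end()
        while i < len(source) and depth:
            depth += {"(": 1, ")": -1}.get(source[i], 0)
            i += 1
        if depth:
            # Unbalanced parentheses: leave the remainder untouched.
            out.append(source[pos:])
            return "".join(out)
        args = split_args(source[match.end():i - 1])
        if len(args) == 6:
            del args[3]
        out.append(source[pos:match.start()])
        out.append("%s(%s)" % (HIP_NAME, ", ".join(args)))
        pos = i


if __name__ == "__main__":
    broken = ("CHECK_EQ(0, hipModuleOccupancyMaxPotentialBlockSize("
              "&mem_stride_copy_gridsize, &mem_stride_copy_blocksize, hfunc, 0, 0, 0));")
    print(fix_occupancy_call(broken))
```

Applied to the failing line from the log, the sketch emits the same call with five arguments. A call that already has five arguments is only renamed, so running the rewrite twice does not change the result.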

Reference: