These changes are required to resolve the following compilation errors when running the BLOOM workload:
1.
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes/conversion_utils_hip.h:270:12: error: use of undeclared identifier '__double2half'; did you mean '__double2hiint'?
return __double2half(val);
^~~~~~~~~~~~~
__double2hiint
/opt/rocm-5.4.0/include/hip/amd_detail/amd_device_functions.h:440:30: note: '__double2hiint' declared here
__device__ static inline int __double2hiint(double x) {
^
2.
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/apply_rotary_pos_emb.hip:195:48: error: use of undeclared identifier '__shfl_sync'
auto q_rot_tmp = lane < half_dim ? __shfl_sync(mask[lane], q_rot, lane + half_dim)
3.
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes/reduction_utils_hip.h:278:43: error: excess elements in struct initializer
constexpr __half2_raw zero = {0x0000, 0x0000};
^~
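For reviewers unfamiliar with error 3, here is a minimal plain-C++ sketch of how an "excess elements" diagnostic can arise and one way around it. It is an illustration only, not the change made in this PR. `Half2RawLike`, `Lanes`, `kZero`, and `pack_half2_bits` are hypothetical names, and the struct layout is an assumption, not the ROCm definition of `__half2_raw`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-lane view; not a ROCm type.
struct Lanes { std::uint16_t x, y; };

// Hypothetical stand-in whose only aggregate member is an anonymous union.
// Assumption: this mimics why a two-value brace initializer can be rejected.
struct Half2RawLike {
    union {
        std::uint32_t bits;  // first union member, the one brace-init reaches
        Lanes lanes;         // per-lane view, not read in this sketch
    };
};

// Half2RawLike bad = {0x0000, 0x0000};  // rejected: second value has no member

// Empty braces zero-initialize the first union member.
constexpr Half2RawLike kZero = {};
static_assert(kZero.bits == 0u, "zero pattern must be all-zero bits");

// Pack two 16-bit patterns into one 32-bit word (lo in the low half).
constexpr Half2RawLike pack_half2_bits(std::uint16_t lo, std::uint16_t hi) {
    return Half2RawLike{(static_cast<std::uint32_t>(hi) << 16) | lo};
}

// Runtime check of the same properties.
inline void self_check() {
    assert(kZero.bits == 0u);
    assert(pack_half2_bits(0x3C00, 0x0000).bits == 0x00003C00u);
}
```

The sketch supplies a single initializer (or none) so that initialization never depends on how many members the vendor type exposes.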