/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/apply_rotary_pos_emb.hip:195:48: error: use of undeclared identifier 'shfl_sync'
            auto q_rot_tmp = lane < half_dim ? shfl_sync(mask[lane], q_rot, lane + half_dim)
                                               ^
Merging this PR, although there's a concern that the workaround might not be functionally correct. Will analyze correctness and update in a follow-up PR if required.