The error (more details: https://github.com/intel/intel-xpu-backend-for-triton/issues/2644#issuecomment-2464373902) seems to be that the operation is inserted into the block incorrectly. My best guess is that we need to explicitly insert a barrier at the beginning of `thenBlock`. However, I don't know exactly why this code works for NVIDIA; it may be because a different number of instructions initially replaces `"gpu.barrier"() : () -> ()`, but I'm not sure.
Closes #2644