intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[OCL interface] Matrix 2D load intel_sub_group_2d_block_read_32b_8r8x2c is lowered to GenISA incorrectly. #1426

Closed: chengjunlu closed this issue 5 months ago

chengjunlu commented 5 months ago

The full signature of the OCL built-in is: intel_sub_group_2d_block_read_32b_8r8x2c(void AS1*, int, int, int, int __vector(2), unsigned int*).

In the sub-group-size=16 case, the call is expected to be lowered to: llvm.genx.GenISA.LSC2DBlockRead.v8i32. But it is actually lowered to: llvm.genx.GenISA.LSC2DBlockRead.v16i32(i64, i32, i32, i32, i32, i32, i32, i32, i32, i32, i1, i1, i32).
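A rough sketch of why v8i32 would be the expected result type at sub-group size 16. It assumes the suffix 32b_8r8x2c means 32-bit elements, 8 rows, 8 columns and 2 blocks, and that the loaded elements are spread evenly across the work-items of the sub-group. Both are assumptions read off the built-in's name, not taken from the IGC sources, and the helper names are hypothetical.

```python
def per_lane_elements(rows: int, cols: int, blocks: int, sub_group_size: int) -> int:
    """Number of elements each work-item holds after a 2D block read.

    Assumes the block is divided evenly across the sub-group (an assumption,
    not confirmed by the issue).
    """
    total = rows * cols * blocks
    if total % sub_group_size != 0:
        raise ValueError("block does not divide evenly across the sub-group")
    return total // sub_group_size


def llvm_vector_suffix(num_elements: int, element_bits: int = 32) -> str:
    """Format an LLVM-style overload suffix such as 'v8i32'."""
    return f"v{num_elements}i{element_bits}"


# 8 rows x 8 columns x 2 blocks of 32-bit elements (assumed decoding of "32b_8r8x2c").
expected = llvm_vector_suffix(per_lane_elements(8, 8, 2, sub_group_size=16))
observed_width = llvm_vector_suffix(per_lane_elements(8, 8, 2, sub_group_size=8))

print("sub-group size 16:", expected)
print("sub-group size 8: ", observed_width)
```

Under these assumptions, the v16i32 overload reported above is the width one would get for a sub-group size of 8 rather than 16.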

On the Triton side, the workaround is to use the GenISA intrinsic directly instead of the OCL interface. The fix itself needs to be tracked on the IGC side.

whitneywhtsang commented 5 months ago

The problem has already been reported and is fixed in the next Agama driver.