Enable fp8 2d block read with fp16 DPAS format

intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs

MIT License


Enable fp8 2d block read with fp16 DPAS format #1499

Open hwnam831 opened 6 days ago

hwnam831 commented 6 days ago

This PR fixes #1442. Because fp8 tensors are promoted to fp16 to enable DPAS, there was a format mismatch in the 2D block read. It is fixed by 1) matching the format when lowering the 2D block load to LLVM and 2) enabling an 8-bit 2D block primitive that supports a different shape format.
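For reviewers who want a picture of the kind of mismatch described above, here is a minimal sketch in plain Python. It is not the code in this PR: `BlockReadShape` and `block_read_shape` are hypothetical names made up for illustration, and the 32x32 tile size is an arbitrary assumption. It only shows that the same logical tile needs a different byte width per row depending on whether the read is sized from the stored fp8 type or from the promoted fp16 type.

```python
from dataclasses import dataclass

# Bit widths of the two element types mentioned in the description.
ELEMENT_BITS = {"fp8": 8, "fp16": 16}


@dataclass(frozen=True)
class BlockReadShape:
    """Hypothetical descriptor for one 2D block read (not a real Triton or LLVM type)."""
    elem_bits: int  # width of each element as stored in memory
    rows: int       # tile height in rows
    cols: int       # tile width in elements

    @property
    def bytes_per_row(self) -> int:
        # Number of bytes one row of the tile occupies in memory.
        return self.cols * self.elem_bits // 8


def block_read_shape(memory_dtype: str, rows: int, cols: int) -> BlockReadShape:
    """Size the read from the given dtype (hypothetical helper)."""
    return BlockReadShape(ELEMENT_BITS[memory_dtype], rows, cols)


# The same logical tile, described two ways (tile size is an assumption).
as_stored = block_read_shape("fp8", rows=32, cols=32)     # what memory actually holds
as_promoted = block_read_shape("fp16", rows=32, cols=32)  # what DPAS consumes after promotion

print(as_stored.bytes_per_row, as_promoted.bytes_per_row)  # prints "32 64"
```

If the read were sized from the promoted type, each row would cover twice as many bytes as the fp8 data occupies, which is one way a format mismatch of this sort can show up. The actual shapes and lowering logic are in the diff.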