intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

Investigate the new tensor descriptor API #2586

Open mfrancepillois opened 3 weeks ago

mfrancepillois commented 3 weeks ago

OpenAI has improved the way structured memory access is handled.

The PR https://github.com/triton-lang/triton/pull/4916 cleans up and extends the Triton dialect with new operations, and improves the way Triton handles TMA descriptors.

Since this PR introduces significant changes, we should investigate whether, and how, memory accesses that use tensor descriptors could be transformed into block pointers.

mfrancepillois commented 3 weeks ago

The information needed to create a block pointer is:

The new tensor descriptor proposal lets users create a tensor descriptor on the device, which is then lowered into a TMA descriptor. The PR extends the Triton dialect with a new operation, MakeTensorDescOp. This operation contains the following information:

Possible alternative: keep MakeTensorDescOp as is in the pipeline and implement only an XPU-specific lowering for it that uses 2D block operations.
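
To make the descriptor-to-block-pointer question more concrete, here is a minimal pure-Python sketch of what such a mapping might look like. It is not the actual pass and does not use the Triton API. `TensorDesc`, `BlockPtr` and `desc_to_block_ptr` are hypothetical names. The sketch assumes the descriptor carries a base, a shape, strides and a block shape, and that each descriptor load supplies per-dimension element indices. The `BlockPtr` fields mirror the arguments of `tl.make_block_ptr` (base, shape, strides, offsets, block_shape, order). Deriving `order` from the strides is also an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorDesc:
    """Hypothetical stand-in for the data assumed to be carried by a tensor descriptor."""
    base: int                      # stand-in for the base pointer
    shape: Tuple[int, ...]         # global tensor shape, in elements
    strides: Tuple[int, ...]       # global strides, in elements
    block_shape: Tuple[int, ...]   # shape of the tile moved by one load/store


@dataclass(frozen=True)
class BlockPtr:
    """Fields mirror the arguments of tl.make_block_ptr."""
    base: int
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]
    offsets: Tuple[int, ...]
    block_shape: Tuple[int, ...]
    order: Tuple[int, ...]


def desc_to_block_ptr(desc: TensorDesc, indices: Tuple[int, ...]) -> BlockPtr:
    """Build block-pointer parameters from a descriptor plus load indices.

    Assumption: the indices passed to a descriptor load are element offsets,
    so they can be reused directly as block-pointer offsets.
    """
    rank = len(desc.shape)
    if not (len(desc.strides) == len(desc.block_shape) == len(indices) == rank):
        raise ValueError("shape, strides, block_shape and indices must share one rank")
    # Assumption: 'order' lists dimensions from fastest- to slowest-varying,
    # which we recover here by sorting dimensions by ascending stride.
    order = tuple(sorted(range(rank), key=lambda d: desc.strides[d]))
    return BlockPtr(
        base=desc.base,
        shape=desc.shape,
        strides=desc.strides,
        offsets=tuple(indices),
        block_shape=desc.block_shape,
        order=order,
    )


if __name__ == "__main__":
    # Row-major 128x64 tensor, accessed in 32x16 tiles.
    desc = TensorDesc(base=0, shape=(128, 64), strides=(64, 1), block_shape=(32, 16))
    print(desc_to_block_ptr(desc, indices=(32, 16)))
```

If the mapping really is this direct, rewriting descriptor loads into block-pointer loads would mostly be a matter of forwarding operands. The alternative of keeping MakeTensorDescOp and lowering it directly avoids the rewrite altogether.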