Closed whitneywhtsang closed 2 weeks ago
It is good news that we can remove the SLPVectorizer. It has been observed that the SLPVectorizer on the Triton side may change the order of operations, which makes the IR more complicated for IGC to optimize.
Changes are LGTM.
The IGCVectorizer in driver agama 1032 has improved. This PR disables the LLVM post-processing that Triton performs by default, which includes the SLPVectorizer. With LLVM post-processing disabled, the performance impact on all 3 key workloads (FA, GEMM, Softmax) is positive. For example, out-of-box GEMM performance improved by 16%.
CI: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11664443045, https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/11674882693