intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[Performance] Enable the SLPVectorizer to improve flash attention performance #2715

Open chengjunlu opened 1 week ago

chengjunlu commented 1 week ago

Enabling the SLPVectorizer together with IGC_DisablePHIScalarizer improves the overall performance of the flash attention forward kernel by 3%.

Enable the SLPVectorizer for now, until the IGCVectorizer can do the same transformation.
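
For anyone who wants to try the IGC half of this combination, here is a minimal sketch of building a process environment with the flag set. It assumes that IGC reads `IGC_DisablePHIScalarizer` from the environment and that the value `"1"` turns it on. The helper name `with_phi_scalarizer_disabled` is hypothetical, and the sketch does not cover enabling the SLPVectorizer itself.

```python
import os


def with_phi_scalarizer_disabled(base_env=None):
    """Return a copy of the environment with IGC_DisablePHIScalarizer set."""
    # Assumption: IGC picks this flag up from the process environment,
    # and "1" is the value that enables it.
    env = dict(os.environ if base_env is None else base_env)
    env["IGC_DisablePHIScalarizer"] = "1"
    return env


# Build an environment on top of a small example base.
env = with_phi_scalarizer_disabled({"PATH": "/usr/bin"})
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a benchmark, so the flag only affects that one run and the parent process environment is left unchanged.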