intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs

[Feature Improvement] Change large GRF warnings to trigger by debug flag #2251

Open Stonepia opened 1 week ago

Stonepia commented 1 week ago

PR https://github.com/intel/intel-xpu-backend-for-triton/pull/1654 introduced automatically recompiling kernels with large GRF mode when register spills are detected.

Could we make the cout in these lines trigger only under a debug flag, so that normal users can safely ignore it? In my opinion, these messages should be treated as warnings.

https://github.com/intel/intel-xpu-backend-for-triton/blob/614efe26adeac8e28fe27c6bbfa7840bbc43ec90/third_party/intel/backend/driver.c#L188-L201

xpu  train AlbertForQuestionAnswering         

// We wish these weren't exposed to the normal user
(I): Detected 9472 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 512 spills
(I): Detected 20032 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 10816 spills
(I): Detected 33600 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 25408 spills
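A minimal sketch of what the gating could look like on the driver.c side, assuming an environment variable is an acceptable debug flag. The variable name TRITON_INTEL_DEBUG_SPILLS and the helper functions are made up for illustration and are not part of the existing code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

// Hypothetical debug flag: only emit the spill/recompile messages when the
// user explicitly opts in via an environment variable.
static bool debug_spills_enabled(void) {
  const char *env = getenv("TRITON_INTEL_DEBUG_SPILLS"); // assumed name
  return env != NULL && env[0] != '\0' && env[0] != '0';
}

// Report the spill counts before and after recompiling with large GRF mode.
// Stays silent unless the debug flag is set, so the recompilation is
// transparent to normal users.
static void report_spills(int before, int after) {
  if (!debug_spills_enabled())
    return;
  fprintf(stderr,
          "(I): Detected %d spills, recompiling the kernel using large GRF mode\n",
          before);
  fprintf(stderr, "(I): Kernel has now %d spills\n", after);
}

int main(void) {
  report_spills(9472, 512); // example values taken from the log above
  return 0;
}
```

With something along these lines, the messages would only appear when a developer opts in, e.g. by running with TRITON_INTEL_DEBUG_SPILLS=1.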
Stonepia commented 1 week ago

I would like to explain my concern about why we don't set grf_mode=auto in the Triton config from the Inductor side.

The reason is that, on the PyTorch Inductor side, we are currently trying to keep the configuration the same as for CUDA/HIP, so that we avoid possible mismatches. Large GRF mode for compiling kernels is an XPU-only optimization, so we would like to hide that complexity from users who are familiar with CUDA.

By the way, I am not very familiar with the differences between the GRF modes, so if there are any concerns, please point them out and let's discuss.