Closed: rgiduthuri-intel closed this issue 2 months ago.
This issue is caused by environment problems. To run the Triton end-to-end tests, a new version of IPEX is required. Currently we build IPEX and PyTorch from source as part of the Triton build (see scripts/compile-pytorch-ipex.sh). That script fetches the PyTorch and IPEX source code, builds it, and installs the resulting packages. Note that you have to uninstall any existing versions first (pip uninstall ...), or the script will not replace them with the newly built ones.
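The flow described above amounts to something like the following sketch. The script path is the one from this repo; the package names (torch, intel-extension-for-pytorch) and the `-y` flag are assumptions on my part, and the script invocation is guarded so the sketch is runnable outside a checkout:

```shell
# Uninstall any existing copies first, otherwise the freshly built
# packages will not replace them.
pip uninstall -y torch intel-extension-for-pytorch

# Build PyTorch and IPEX from source and install the results
# (path is relative to the intel-xpu-backend-for-triton checkout).
if [ -x scripts/compile-pytorch-ipex.sh ]; then
    ./scripts/compile-pytorch-ipex.sh
fi
```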
In my environment I have:
I can run the tutorial correctly (using the latest Triton commit).
@rgiduthuri-intel, can you please give it a try and report back on whether it solves the problem you are facing?
@rgiduthuri-intel, we also build torch, ipex, and triton wheels nightly and attach them as artifacts to the corresponding job. This is the latest run, for example: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/8334844255. You can download the artifact for your Python version and install all the wheels with pip install *.whl. Let us know if that works for you.
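The wheel-based flow above can be sketched as follows. It assumes the artifact for your Python version has already been downloaded and extracted into the current directory; the uninstall list is my assumption of the matching package names, and the install step is guarded so the sketch does nothing when no wheels are present:

```shell
# Remove existing installs first to avoid mixing old and new versions.
pip uninstall -y torch intel-extension-for-pytorch triton

# Install every wheel from the extracted artifact.
if ls *.whl >/dev/null 2>&1; then
    pip install *.whl
fi
```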
@etiotto: now I'm able to run with a clean build using both scripts/compile-triton.sh --env and scripts/compile-pytorch-ipex.sh. @pbchekin I will try downloading the .whl files going forward. Thank you.
Now that the build problem is resolved, I'm back to the crash with 03-matrix-multiplication.py that I mentioned in the original message about an older build. I'd appreciate your help.
$ python 03-matrix-multiplication.py
L0 build module failed. Log:
error: total scratch space exceeds HW supported limit for kernel matmul_kernel_0d1d2d3d4d5d6d7c8d9c10d11c: 338112 bytes (max permitted PTSS 262144 bytes)
error: backend compiler failed build.
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM WARNING: AMX state allocation in the OS failed!
LIBXSMM_TARGET: clx [Genuine Intel(R) CPU 0000%@]
Registry and code: 13 MB
Command: python 03-matrix-multiplication.py
Uptime: 15.185146 s
Segmentation fault (core dumped)
@rgiduthuri-intel so using the latest Triton version and the pinned versions of PyTorch and IPEX, you are able to run the tutorial. Correct?
I do not know how to reproduce the issue you have with the older Triton build. In general we aim to fix problems that are reproducible using the latest version of Triton. Fixing older builds is not something we can do.
It is possible your older binary fails to run because the IPEX (or PyTorch) version is not the required one. The interface between PyTorch/IPEX and Triton has to be in sync.
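One quick way to check whether the installed packages actually come from a matched build is to print their versions. This is a generic standard-library check, not something from this repo; the exact version strings to expect depend on which nightly you installed:

```python
import importlib.metadata as md

# Print the installed version of each package in the torch/IPEX/Triton
# stack, so stale or mismatched installs are easy to spot.
for pkg in ("torch", "intel-extension-for-pytorch", "triton"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```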
@pbchekin do we have older versions of Triton saved somewhere?
@rgiduthuri-intel, do you have the same issue with the latest pytorch, ipex, and triton wheels? Looking at the error message:
total scratch space exceeds HW supported limit
I would say there is a hardware limitation. What GPU do you have?
It is a known issue; you can comment out the problematic configuration, as is done in https://github.com/intel/intel-xpu-backend-for-triton/blob/llvm-target/python/tutorials/03-matrix-multiplication.py#L167.
@@ -163,8 +164,9 @@ import triton.language as tl
# provided configs
@triton.autotune(
configs=[
- triton.Config({'BLOCK_SIZE_M': 128, 'BLOCK_SIZE_N': 256, 'BLOCK_SIZE_K': 64, 'GROUP_SIZE_M': 8}, num_stages=3,
- num_warps=8),
+ # FIXME: Once tl.dot uses DPAS put back the workload commented out.
+ # triton.Config({'BLOCK_SIZE_M': 128, 'BLOCK_SIZE_N': 256, 'BLOCK_SIZE_K': 64, 'GROUP_SIZE_M': 8}, num_stages=3,
+ # num_warps=8),
@rgiduthuri-intel can we close this ticket?
Please close it if it's taken care of. Thanks.
I'm not able to run the Triton GEMM tutorial example using intel-xpu-backend-for-triton. Below are the local changes made to "triton/python/tutorials/03-matrix-multiplication.py" to use XPU. Please scroll down to see the error messages from the latest commit as well as from one a few weeks old. Strangely, the errors are of different kinds.
Appreciate a quick workaround if possible. Thanks.
with intel-xpu-backend-for-triton latest commit 0469c4053cfefe7957f2127f7b41900baf934c4a
with an intel-xpu-backend-for-triton commit from a few weeks back (I don't have the exact commit)