Open chengjunlu opened 1 year ago
The kernel loaded without error on integrated graphics:
root device count: 1
compile kernel on device: Intel(R) Iris(R) Xe Graphics
triton__0d1d2d3d4d5d6d7d8d9d10d11d12d13d14d15d16d17d18d19d20d21d22d23d24d25d26d27d28d29d30d31d32d33d34d35d36d37d38d39d40d41d42d43d44d45d46d47d48d49d50d51d52d53d54d55d56d57d58d59d60d61d62d63d64d65d66d67d68d69d70d71d72d73d74d75d76d77d78d79d80d81d82d83d84d85d86d
create kernel:triton__0d1d2d3d4d5d6d7d8d9d10d11d12d13d14d15d16d17d18d19d20d21d22d23d24d25d26d27d28d29d30d31d32d33d34d35d36d37d38d39d40d41d42d43d44d45d46d47d48d49d50d51d52d53d54d55d56d57d58d59d60d61d62d63d64d65d66d67d68d69d70d71d72d73d74d75d76d77d78d79d80d81d82d83d84d85d86d
compiled kernel ptr: 0x4dc1cd0
total kernels:1
kernel:triton__0d1d2d3d4d5d6d7d8d9d10d11d12d13d14d15d16d17d18d19d20d21d22d23d24d25d26d27d28d29d30d31d32d33d34d35d36d37d38d39d40d41d42d43d44d45d46d47d48d49d50d51d52d53d54d55d56d57d58d59d60d61d62d63d64d65d66d67d68d69d70d71d72d73d74d75d76d77d78d79d80d81d82d83d84d85d86d @0x4dc1cd0
My configuration
silee2@silee2-mobl:~/Projects/frameworks.ai.pytorch.ipex-gpu/build [chengjun/test_dpcpp|⚑ 3]$ apt list level-zero
Listing... Done
level-zero/now 1.11.0 amd64 [installed,local]
silee2@silee2-mobl:~/Projects/frameworks.ai.pytorch.ipex-gpu/build [chengjun/test_dpcpp|⚑ 3]$ dpcpp --version
icpx: warning: use of 'dpcpp' is deprecated and will be removed in a future release. Use 'icpx -fsycl' [-Wdeprecated]
Intel(R) oneAPI DPC++/C++ Compiler 2023.1.0 (2023.1.0.20230320)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/silee2/intel/oneapi/compiler/2023.1.0/linux/bin-llvm
Configuration file: /home/silee2/intel/oneapi/compiler/2023.1.0/linux/bin-llvm/../bin/icpx.cfg
silee2@silee2-mobl:~/Projects/frameworks.ai.pytorch.ipex-gpu/build [chengjun/test_dpcpp|⚑ 3]$ apt list intel-igc*
Listing... Done
intel-igc-core/now 1.0.14062.11 amd64 [installed,local]
intel-igc-opencl/now 1.0.14062.11 amd64 [installed,local]
The iGPU is from an i5-11300H (https://www.intel.com/content/www/us/en/products/sku/196656/intel-core-i511300h-processor-8m-cache-up-to-4-40-ghz-with-ipu/specifications.html).
The case failed on both ATSM and the iGPU on Alder Lake.
root device count: 2
compile kernel on device: Intel(R) UHD Graphics 770
create kernel:triton__0d1d2d3d4d5d6d7d8d9d10d11d12d13d14d15d16d17d18d19d20d21d22d23d24d25d26d27d28d29d30d31d32d33d34d35d36d37d38d39d40d41d42d43d44d45d46d47d48d49d50d51d52d53d54d55d56d57d58d59d60d61d62d63d64d65d66d67d68d69d70d71d72d73d74d75d76d77d78d79d80d81d82d83d84d85d86d
L0 API error code:78000011
Here is my configuration:
ii intel-fw-gpu 2023.12.2+207 all Firmware package for Intel integrated and discrete GPUs
ii intel-gpu-tools 1.26-2 amd64 tools for debugging the Intel graphics driver
ii intel-i915-dkms 1.23.4.15.230307.15.5.17.0.1030+i28-1 all Out of tree i915 driver for Ubuntu oem kernel version 5.17.
ii intel-igc-cm 1.0.176+i600~22.04 amd64 Intel(R) C for Metal Compiler -- CM Frontend lib
ii intel-level-zero-gpu 1.3.26032.26-627~22.04 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-media-va-driver-non-free:amd64 23.1.6-622~22.04 amd64 VAAPI driver for the Intel GEN8+ Graphics family
ii intel-microcode 3.20230214.0ubuntu0.22.04.1 amd64 Processor microcode firmware for Intel CPUs
ii intel-opencl-icd 23.13.26032.26-627~22.04 amd64 Intel graphics compute runtime for OpenCL
ii intel-platform-cse-dkms 2023.11.1-36 amd64 CSE driver
ii intel-platform-vsec-dkms 2023.20.0-3 amd64 Intel Extended Capabilities auxiliary bus driver
ii libdrm-intel1:amd64 2.4.113-2~ubuntu0.22.04.1 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii xserver-xorg-video-intel 2:2.99.917+git20210115-1 amd64 X.Org X server -- Intel i8xx, i9xx display driver
@silee2,
I see that your log contains the kernel name triton__0d1d2d3d4d5d6d7d8d9d10d11d12d13d14d15d16d17d18d19d20d21d22d23d24d25d26d27d28d29d30d31d32d33d34d35d36d37d38d39d40d41d42d43d44d45d46d47d48d49d50d51d52d53d54d55d56d57d58d59d60d61d62d63d64d65d66d67d68d69d70d71d72d73d74d75d76d77d78d79d80d81d82d83d84d85d86d
That means the L0 module was loaded correctly and the kernels in the module can be iterated.
On my platform, however, the L0 module is created without the kernel.
One very large Triton kernel cannot be loaded correctly through the L0 API. The L0 API zeKernelCreate returns error code 0x78000011. We double-checked that the kernel name passed in is exactly the same as the one in the SPIR-V IR.
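For anyone reading the raw code in the logs, here is a minimal sketch of a helper that maps the hex value printed by the test to a symbolic ze_result_t name. The function names describe_ze_result and parse_hex_code are hypothetical, and the table is only a small subset written from memory of ze_api.h, so please verify the values against the header installed on your system. If the mapping is right, 0x78000011 corresponds to ZE_RESULT_ERROR_INVALID_KERNEL_NAME, which would match the symptom that the module is created without the kernel.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps a few Level Zero result codes to their names.
// Assumption: the values below are a subset recalled from ze_api.h;
// check them against your installed header before relying on them.
const char* describe_ze_result(std::uint32_t code) {
    switch (code) {
        case 0x0:        return "ZE_RESULT_SUCCESS";
        case 0x78000001: return "ZE_RESULT_ERROR_UNINITIALIZED";
        case 0x78000004: return "ZE_RESULT_ERROR_INVALID_ARGUMENT";
        case 0x78000011: return "ZE_RESULT_ERROR_INVALID_KERNEL_NAME";
        default:         return "unknown (look it up in ze_api.h)";
    }
}

// Hypothetical helper: parses a code as it appears in the logs,
// with or without the "0x" prefix (e.g. "78000011" or "0x78000011").
std::uint32_t parse_hex_code(const std::string& text) {
    return static_cast<std::uint32_t>(std::stoul(text, nullptr, 16));
}
```

Calling describe_ze_result(parse_hex_code("78000011")) with the value from the failing log gives the symbolic name to search for in the Level Zero documentation.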
A simple C++ unit test that reproduces this issue is available at https://github.com/intel-innersource/frameworks.ai.pytorch.ipex-gpu/tree/chengjun/test_dpcpp. You can use the following command to build and run the test from the root directory of the code:
Result on the ATSM platform: