Open bader opened 4 months ago
AFAIK, this test is not known to be flaky, and it passed in the last couple of post-commit runs. Just in case, I'll run it locally with the repo checked out at the commit that saw the post-commit failure.
This is passing locally:
lbushi@scsel-tl-02:~/sycl_workspace/llvm/build/bin$ ./sycl-ls
INFO: Output filtered by ONEAPI_DEVICE_SELECTOR environment variable, which is set to level_zero:gpu.
To see device ids, use the --ignore-device-selectors CLI option.
[level_zero:gpu] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 12.0.0 [1.3.28202]
lbushi@scsel-tl-02:~/sycl_workspace/llvm/build/bin$ llvm-lit -v ../../sycl/test-e2e/forward_progress/forward_progress_kernel_param_L0_gpu.cpp
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:414: note: Targeted devices: all
INFO: Output filtered by ONEAPI_DEVICE_SELECTOR environment variable, which is set to level_zero:gpu.
To see device ids, use the --ignore-device-selectors CLI option.
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:632: note: Found pre-installed AOT device compiler ocloc
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:632: note: Found pre-installed AOT device compiler opencl-aot
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:733: note: Aspects for level_zero:gpu: ext_oneapi_opportunistic_group, ext_intel_esimd, ext_intel_device_id, atomic64, ext_intel_gpu_slices, ext_oneapi_bindless_images_shared_usm, usm_host_allocations, ext_intel_pci_address, ext_oneapi_private_alloca, ext_intel_gpu_eu_count_per_subslice, usm_shared_allocations, ext_intel_gpu_hw_threads_per_eu, ext_oneapi_limited_graph, online_compiler, queue_profiling, ext_intel_gpu_eu_count, online_linker, usm_device_allocations, gpu, ext_oneapi_virtual_mem, ext_oneapi_mipmap, fp16, ext_intel_device_info_uuid, ext_intel_legacy_image, ext_intel_memory_clock_rate, ext_intel_gpu_subslices_per_slice, ext_oneapi_mipmap_anisotropy, ext_oneapi_ballot_group, ext_oneapi_tangle_group, ext_oneapi_fixed_size_group, ext_intel_gpu_eu_simd_width, ext_oneapi_queue_profiling_tag, ext_oneapi_bindless_images_2d_usm, ext_intel_memory_bus_width, ext_oneapi_bindless_images
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:745: note: SG sizes for level_zero:gpu: 32, 8, 16
llvm-lit: /nfs/site/home/lbushi/sycl_workspace/llvm/sycl/test-e2e/lit.cfg.py:754: note: Architectures for level_zero:gpu: intel_gpu_tgllp
-- Testing: 1 tests, 1 workers --
PASS: SYCL :: forward_progress/forward_progress_kernel_param_L0_gpu.cpp (1 of 1)
Testing Time: 3.84s
Total Discovered Tests: 1
Passed: 1 (100.00%)
lbushi@scsel-tl-02:~/sycl_workspace/llvm/build/bin$ git show HEAD
commit 8bf7ae39fbbc2bca93b63c511b3350a0b2da9ab1 (HEAD)
Author: Alexey Bader <alexey.bader@intel.com>
Date: Mon Jul 22 12:43:22 2024 -0700
[CODEOWNERS] Fix merge conflict with community change. (#14688)
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 90177ad8d05e..772af752ea60 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -47,12 +47,6 @@ sycl/test-e2e/Plugin/dll-detach-order.cpp @intel/llvm-reviewers-runtime
sycl/plugins/**/cuda/ @intel/llvm-reviewers-cuda
sycl/plugins/**/hip/ @intel/llvm-reviewers-cuda
-# Transform Dialect in MLIR.
-/mlir/include/mlir/Dialect/Transform/* @ftynse @nicolasvasilache
-/mlir/lib/Dialect/Transform/* @ftynse @nicolasvasilache
-/mlir/**/*TransformOps* @ftynse @nicolasvasilache
-
-
# CUDA specific runtime implementations
sycl/include/sycl/ext/oneapi/experimental/cuda/ @intel/llvm-reviewers-cuda
lbushi@scsel-tl-02:~/sycl_workspace/llvm/build/bin$
Closing as I cannot reproduce and I don't see this failure in recent post-commits.
@lbushi25 I saw this again today. Maybe you can take another look and see whether it's sporadic? Thanks
# RUN: at line 2
/__w/llvm/llvm/toolchain/bin//clang++ -Werror -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/forward_progress/forward_progress_kernel_param_L0_gpu.cpp -o /__w/llvm/llvm/build-e2e/forward_progress/Output/forward_progress_kernel_param_L0_gpu.cpp.tmp.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/forward_progress/forward_progress_kernel_param_L0_gpu.cpp -o /__w/llvm/llvm/build-e2e/forward_progress/Output/forward_progress_kernel_param_L0_gpu.cpp.tmp.out
# note: command had no output on stdout or stderr
# RUN: at line 3
env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/forward_progress/Output/forward_progress_kernel_param_L0_gpu.cpp.tmp.out
# executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/forward_progress/Output/forward_progress_kernel_param_L0_gpu.cpp.tmp.out
# .---command stderr------------
# | terminate called after throwing an instance of 'sycl::_V1::exception'
# | what(): UR error
# `-----------------------------
# error: command failed with exit status: -6
--
********************
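A note on the `exit status: -6` in the log above: assuming lit reports exit codes the way Python's `subprocess` does on POSIX (a negative value `-N` means the process was killed by signal `N`), -6 corresponds to SIGABRT. That matches the `terminate called` message, since `std::terminate` calls `abort()` by default. A minimal sketch for decoding such a status; `describe_exit_status` is a hypothetical helper name, not part of lit:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Turn a subprocess-style exit status into a readable description.

    Assumes the POSIX convention used by Python's subprocess module:
    a negative status -N means the child was terminated by signal N.
    """
    if status < 0:
        try:
            name = signal.Signals(-status).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"unknown signal {-status}"
        return f"killed by {name}"
    if status == 0:
        return "exited successfully"
    return f"exited with code {status}"


# The status reported in the CI log above.
print(describe_exit_status(-6))
```

So the test binary did not return an error code itself. It aborted because the `sycl::_V1::exception` ("UR error") was never caught.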
Just encountered this failure: https://github.com/intel/llvm/actions/runs/10718850999/job/29725040213
It passed again on a retry.
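Since a single local pass and a passing retry both suggest the failure is sporadic, one way to estimate how often it happens is to run the test many times and count the non-zero exits. A rough sketch, assuming a POSIX machine with `llvm-lit` on `PATH`; `count_failures` is a hypothetical helper, and the run count is arbitrary:

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and collect (run_index, returncode) for failures.

    Output is captured so a long loop does not flood the terminal.
    """
    failures = []
    for i in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((i, result.returncode))
    return failures


# Example usage (not executed here), with the lit invocation from this thread:
#   cmd = ["llvm-lit", "-v",
#          "../../sycl/test-e2e/forward_progress/"
#          "forward_progress_kernel_param_L0_gpu.cpp"]
#   failed = count_failures(cmd, runs=100)
#   print(f"{len(failed)} of 100 runs failed: {failed}")

# Self-check with a command that always fails with exit code 3.
demo = count_failures([sys.executable, "-c", "import sys; sys.exit(3)"], runs=2)
print(demo)
```

A clean run of the loop on one machine would not rule out the failure, since the CI runners may differ in GPU, driver version, or load from the local setup.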
Describe the bug
Log from post-commit results for https://github.com/intel/llvm/commit/8bf7ae39fbbc2bca93b63c511b3350a0b2da9ab1 (non-functional change).
Full log: logs_26294248843 (1).zip, GitHub Actions Link.
To reproduce
No response
Environment
Additional context
No response