intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

DX12 read_write_unsampled_semaphore test failure #15851

Open Seanst98 opened 1 month ago

Seanst98 commented 1 month ago

Describe the bug

The read_write_unsampled_semaphore test fails non-deterministically and produces no output on stdout or stderr. The process exits with status 0xc0000409 (decimal 3221226505).

FAIL: SYCL :: bindless_images/dx12_interop/read_write_unsampled_semaphore.cpp (1 of 2)
******************** TEST 'SYCL :: bindless_images/dx12_interop/read_write_unsampled_semaphore.cpp' FAILED ********************
Exit Code: 3221226505
Command Output (stdout):
--
# RUN: at line 4
C:/gitlab-runner/builds/rm-hs3Zv/0/sean.stirling/intel-llvm-mirror-bi-ci/build/bin/clang++ -DWIN32 -D_WINDOWS -Werror  -fsycl -fsycl-targets=nvptx64-nvidia-cuda  C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\sycl\test-e2e\bindless_images\dx12_interop\read_write_unsampled_semaphore.cpp -l d3d12 -l dxgi -l dxguid -o C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\build\tools\sycl\test-e2e\bindless_images\dx12_interop\Output\read_write_unsampled_semaphore.cpp.tmp.out
# executed command: C:/gitlab-runner/builds/rm-hs3Zv/0/sean.stirling/intel-llvm-mirror-bi-ci/build/bin/clang++ -DWIN32 -D_WINDOWS -Werror -fsycl -fsycl-targets=nvptx64-nvidia-cuda 'C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\sycl\test-e2e\bindless_images\dx12_interop\read_write_unsampled_semaphore.cpp' -l d3d12 -l dxgi -l dxguid -o 'C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\build\tools\sycl\test-e2e\bindless_images\dx12_interop\Output\read_write_unsampled_semaphore.cpp.tmp.out'
# .---command stderr------------
# | warning: overriding the module target triple with x86_64-pc-windows-msvc19.40.33811 [-Woverride-module]
# | 1 warning generated.
# `-----------------------------
# RUN: at line 5
env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1  C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\build\tools\sycl\test-e2e\bindless_images\dx12_interop\Output\read_write_unsampled_semaphore.cpp.tmp.out
# executed command: env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 'C:\gitlab-runner\builds\rm-hs3Zv\0\sean.stirling\intel-llvm-mirror-bi-ci\build\tools\sycl\test-e2e\bindless_images\dx12_interop\Output\read_write_unsampled_semaphore.cpp.tmp.out'
# note: command had no output on stdout or stderr
# error: command failed with exit status: 0xc0000409
--
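
The two exit codes in the log above (3221226505 and 0xc0000409) are the same value. A minimal sketch for decoding it follows. The helper names `to_hex` and `ntstatus_name` are hypothetical and the lookup table is deliberately incomplete. On Windows, 0xC0000409 is STATUS_STACK_BUFFER_OVERRUN, which is also the status reported by fail-fast termination paths, so it may indicate an abort-style exit rather than a literal stack overrun. That reading is an assumption; the log does not confirm it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format an exit code the way lit prints it, e.g. "0xc0000409".
std::string to_hex(std::uint32_t code) {
  char buf[11]; // "0x" + 8 hex digits + terminating NUL
  std::snprintf(buf, sizeof buf, "0x%08x", static_cast<unsigned>(code));
  return buf;
}

// Hypothetical helper: map a few well-known NTSTATUS values to their names.
// This table is illustrative only and far from complete.
const char *ntstatus_name(std::uint32_t code) {
  switch (code) {
  case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
  case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
  case 0xC0000409u: return "STATUS_STACK_BUFFER_OVERRUN";
  default:          return "unknown";
  }
}
```

Calling `to_hex(3221226505u)` yields the hexadecimal form reported by lit, and `ntstatus_name(3221226505u)` returns the matching status name.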

To reproduce

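Run the E2E test bindless_images/dx12_interop/read_write_unsampled_semaphore.cpp on a Windows machine with the CUDA backend. Because the failure is non-deterministic, several runs may be needed to trigger it.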
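
Since the failure is intermittent, it can help to run the compiled test binary repeatedly and count how often it fails. The sketch below is one way to do that, assuming the binary has already been built and SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 is set in the environment. The function name `count_failures` is hypothetical and is not part of the test suite.

```cpp
#include <cstdlib>
#include <string>

// Run `command` through the system shell `runs` times and count how many
// invocations report a non-zero result. The exact meaning of the value
// returned by std::system is platform-specific, so it is only compared
// against zero here.
int count_failures(const std::string &command, int runs) {
  int failures = 0;
  for (int i = 0; i < runs; ++i) {
    if (std::system(command.c_str()) != 0)
      ++failures;
  }
  return failures;
}
```

For example, `count_failures(path_to_test_binary, 100)` gives a rough failure rate, where `path_to_test_binary` stands for the `.tmp.out` executable that lit produced.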

Environment

$ .\bin\sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.27359]
[opencl:gpu][opencl:0] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO  [31.0.101.4953]
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 4060 Ti 8.9 [CUDA 12.7]
Platforms: 3
Platform [#1]:
    Version  : 1.3
    Name     : Intel(R) oneAPI Unified Runtime over Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : 12.2.0
        Name              : Intel(R) UHD Graphics 770
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.27359
        UUID              : 134128128167400002000000
        DeviceID          : 42880
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_virtual_functions
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_adl_s
Platform [#2]:
    Version  : OpenCL 3.0 
    Name     : Intel(R) OpenCL Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : OpenCL 3.0 NEO 
        Name              : Intel(R) UHD Graphics 770
        Vendor            : Intel(R) Corporation
        Driver            : 31.0.101.4953
        UUID              : 134128128167400002000000
        DeviceID          : 2
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca ext_oneapi_atomic16 ext_oneapi_virtual_functions
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_adl_s
Platform [#3]:
    Version  : CUDA 12.7
    Name     : NVIDIA CUDA BACKEND
    Vendor   : NVIDIA Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : 8.9
        Name              : NVIDIA GeForce RTX 4060 Ti
        Vendor            : NVIDIA Corporation
        Driver            : CUDA 12.7
        UUID              : 1810323858799654106201157113010250248207
        DeviceID          : 0
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address atomic64 ext_intel_device_info_uuid ext_oneapi_native_assert ext_oneapi_cuda_async_barrier ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_oneapi_bindless_images ext_oneapi_bindless_images_shared_usm ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_oneapi_external_memory_import ext_oneapi_external_semaphore_import ext_oneapi_mipmap ext_oneapi_mipmap_anisotropy ext_oneapi_mipmap_level_reference ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_cubemap ext_oneapi_cubemap_seamless_filtering ext_oneapi_bindless_sampled_image_fetch_1d_usm ext_oneapi_bindless_sampled_image_fetch_2d_usm ext_oneapi_bindless_sampled_image_fetch_2d ext_oneapi_bindless_sampled_image_fetch_3d ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_image_array ext_oneapi_unique_addressing_per_dim ext_oneapi_bindless_images_sample_1d_usm ext_oneapi_bindless_images_sample_2d_usm
Images are not fully supported by the CUDA BE, their support is disabled by default. Their partial support can be activated by setting SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT environment variable at runtime.
        info::device::sub_group_sizes: 32
        Architecture: nvidia_gpu_sm_89
default_selector()      : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.27359]
accelerator_selector()  : No device of requested type available.
cpu_selector()          : No device of requested type available.
gpu_selector()          : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.27359]
custom_selector(gpu)    : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.27359]
custom_selector(cpu)    : No device of requested type available.
custom_selector(acc)    : No device of requested type available.

Additional context

We are going to XFAIL the test for now, until a proper fix has been implemented.

Seanst98 commented 4 weeks ago

See this XFAIL PR: https://github.com/intel/llvm/pull/15875