intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

[E2E] Multiple Matrix/SPVCooperativeMatrix/* E2E tests XPASS'ing in post-commit, Arc #15278

Closed uditagarwal97 closed 1 month ago

uditagarwal97 commented 1 month ago

Describe the bug

Affected tests:

********************
Unexpectedly Passed Tests (14):
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_abc.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_1d.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_1d_cont.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_half.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_int8.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_int8_packed.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_ops_scalar.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_all_sizes.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_ops.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/get_coord_float_matC.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/get_coord_int8_matA.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/get_coord_int8_matB.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/joint_matrix_apply_bf16.cpp

I see these failures in https://github.com/intel/llvm/actions/runs/10692386081/job/29640892260?pr=14720#step:22:31 and https://github.com/intel/llvm/actions/runs/10692983484/job/29642595868
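
When triaging a post-commit log like the one above, it can help to pull the XPASS list out of the lit summary programmatically. The following is only a minimal sketch: it assumes the summary has the same shape as the excerpt in this report (an "Unexpectedly Passed Tests (N):" header followed by "SUITE :: path" lines), and the function name `unexpected_passes` is hypothetical, not part of lit or the DPC++ tooling.

```python
import re


def unexpected_passes(lit_output: str) -> list[str]:
    """Return test paths listed under an 'Unexpectedly Passed Tests' header.

    Assumes the summary layout shown in this issue; other lit versions
    or custom reporters may format the summary differently.
    """
    names = []
    in_section = False
    for line in lit_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Unexpectedly Passed Tests"):
            in_section = True
            continue
        if not in_section:
            continue
        match = re.match(r"^(\S+) :: (\S+)$", stripped)
        if match:
            # group(1) is the suite name (e.g. SYCL), group(2) the test path.
            names.append(match.group(2))
        elif stripped:
            # Any other non-blank line (e.g. another summary header) ends the section.
            in_section = False
    return names


# Sample input copied from two entries of the list in this report.
sample = """\
********************
Unexpectedly Passed Tests (2):
  SYCL :: Matrix/SPVCooperativeMatrix/element_wise_abc.cpp
  SYCL :: Matrix/SPVCooperativeMatrix/get_coord_int8_matA.cpp
"""

for test_path in unexpected_passes(sample):
    print(test_path)
```

The resulting paths can then be compared against the tests touched by a candidate fix to see whether it covers every XPASS.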

To reproduce

DPC++ commit: fad405cdfc39712fe2ebf2eb4dea54255ffa8347
GPU: Arc

Environment

sycl-ls --verbose output:

[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A750 Graphics 12.55.8 [1.3.30049.600000]
[opencl:gpu][opencl:0] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A750 Graphics OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:cpu][opencl:1] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900 OpenCL 3.0 (Build 0) [2024.18.6.0.02_160000]
[opencl:fpga][opencl:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.18.6.0.02_160000]
[native_cpu:cpu][native_cpu:0] SYCL_NATIVE_CPU, SYCL Native CPU 0.1 [0.0.0]

Platforms: 5
Platform [#1]:
    Version  : 1.3
    Name     : Intel(R) oneAPI Unified Runtime over Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : 12.55.8
        Name              : Intel(R) Arc(TM) A750 Graphics
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.30049.600000
        UUID              : 13412816186800030000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_acm_g10
Platform [#2]:
    Version  : OpenCL 3.0 
    Name     : Intel(R) OpenCL Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : OpenCL 3.0 NEO 
        Name              : Intel(R) Arc(TM) A750 Graphics
        Vendor            : Intel(R) Corporation
        Driver            : 24.26.30049.6
        UUID              : 13412816186800030000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_private_alloca
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_acm_g10
Platform [#3]:
    Version  : OpenCL 3.0 LINUX
    Name     : Intel(R) OpenCL
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type              : cpu
        Version           : OpenCL 3.0 (Build 0)
        Name              : 12th Gen Intel(R) Core(TM) i9-12900
        Vendor            : Intel(R) Corporation
        Driver            : 2024.18.6.0.02_160000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : cpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_oneapi_srgb ext_oneapi_native_assert ext_intel_legacy_image ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
        info::device::sub_group_sizes: 4 8 16 32 64
        Architecture: x86_64
Platform [#4]:
    Version  : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
    Name     : Intel(R) FPGA Emulation Platform for OpenCL(TM)
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#2]:
        Type              : fpga
        Version           : OpenCL 1.2 
        Name              : Intel(R) FPGA Emulation Device
        Vendor            : Intel(R) Corporation
        Driver            : 2024.18.6.0.02_160000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : accelerator fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_atomic_host_allocations usm_atomic_shared_allocations ext_oneapi_srgb ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_fpga_task_sequence ext_oneapi_private_alloca
        info::device::sub_group_sizes: 4 8 16 32 64
        Architecture: unknown
Platform [#5]:
    Version  : 0.1
    Name     : SYCL_NATIVE_CPU
    Vendor   : tbd
    Devices  : 1
        Device [#0]:
        Type              : cpu
        Version           : 0.1
        Name              : SYCL Native CPU
        Vendor            : Intel(R) Corporation
        Driver            : 0.0.0
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : cpu fp16 fp64 queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64
        info::device::sub_group_sizes: 1
        Architecture: unknown
default_selector()      : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A750 Graphics 12.55.8 [1.3.30049.600000]
accelerator_selector()  : fpga, Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.18.6.0.02_160000]
cpu_selector()          : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900 OpenCL 3.0 (Build 0) [2024.18.6.0.02_160000]
gpu_selector()          : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A750 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(gpu)    : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A750 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(cpu)    : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900 OpenCL 3.0 (Build 0) [2024.18.6.0.02_160000]
custom_selector(acc)    : fpga, Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.18.6.0.02_160000]

Additional context

No response

uditagarwal97 commented 1 month ago

@MrSidims FYI. I think your PR (https://github.com/intel/llvm/pull/15272) fixes these.

MrSidims commented 1 month ago

The fix got merged, closing.