intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

PI port: post merge work #14598

Open aarongreig opened 1 month ago

aarongreig commented 1 month ago

I'm creating this issue to capture any work that will need to be done post-merge. Inevitably, we've had to leave some things unfinished to make the ABI-breaking window.

For reference, this is the PI merge change: https://github.com/intel/llvm/pull/14145

Tasks from review

Regressions

This is the list of known regressions that have had XFAIL added to them in anticipation of post-merge fixes.

From sycl/test/:

(all but a handful of these will be fixed by the inclusion of https://github.com/oneapi-src/unified-runtime/pull/1871)

From sycl/test-e2e

Windows only:

Cuda only:

New regressions as of merge on 24/07

unittests:

New regressions as of merge on 26/07

It seems we have also introduced some regressions in the SYCL CTS, so:

omarahmed1111 commented 1 month ago

Basic/aspects.cpp is working normally and passing with the PI2UR PR. It does not have an XFAIL on the PR; it only had a comment directing to this issue, which should be deleted.

KornevNikita commented 1 month ago

@omarahmed1111 could you please clarify the status of these tests?

 AddressSanitizer/common/kernel-debug.cpp
 AddressSanitizer/multiple-reports/multiple_kernels.cpp
 AddressSanitizer/multiple-reports/one_kernel.cpp
 AddressSanitizer/use-after-free/quarantine-free.cpp

They're failing in Nightly, but there is no issue for them. Should we create one?

aarongreig commented 1 month ago

Ah, it seems we missed xfailing those; they are affected by the same issue as https://github.com/intel/llvm/issues/14658. We do have a fix (https://github.com/oneapi-src/unified-runtime/pull/1883), but I think we're prioritizing merging feature PRs for now, so I'll open a PR to xfail them to unblock the nightly.

KornevNikita commented 1 month ago

Ok, good.

After the port there are also failures in SYCL-CTS: https://github.com/intel/llvm/actions/runs/10137365848/job/28027872635. They all look like this:

  /__w/llvm/llvm/khronos_sycl_cts/tests/queue/queue_shortcuts_explicit_core.cpp:30: FAILED:
    {Unknown expression after the reported line}
  due to unexpected exception with messages:
    for type "char": 
    SYCL exception
    with category name: 'sycl'
    with code: 'runtime'
    with code raw value: '1'
    with code message: 'SYCL Error'
    with what: 'Native API failed. Native API returns: 45
    (UR_RESULT_ERROR_INVALID_ARGUMENT)'

Do you know if this is a known issue?
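
For triaging many failures of this shape, the numeric result and its name can be pulled out of the `what` text. The following is only a sketch and not tooling used in this thread: the function name `parse_native_result` is hypothetical, and the pattern assumes the message keeps the "Native API returns: <number> (<NAME>)" layout seen in the log above.

```python
import re
from typing import Optional, Tuple

# Pattern for the "what" text in the log above. \s* tolerates the line wrap
# between the numeric code and the parenthesised result name.
_NATIVE_RESULT = re.compile(r"Native API returns:\s*(-?\d+)\s*\((\w+)\)")


def parse_native_result(log_text: str) -> Optional[Tuple[int, str]]:
    """Return (code, name) from the first 'Native API returns' match, or None."""
    match = _NATIVE_RESULT.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Sample copied from the failure output quoted above.
sample = """with what: 'Native API failed. Native API returns: 45
    (UR_RESULT_ERROR_INVALID_ARGUMENT)'"""

print(parse_native_result(sample))
```

Grouping failures by the returned name is one way to check whether all CTS failures share the same `UR_RESULT_ERROR_INVALID_ARGUMENT` cause.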

aarongreig commented 1 month ago

It isn't a known problem; I've added an issue to the OP to investigate. I have seen failures like that before, so it's very likely that a fix for one of the remaining e2e regressions will also fix the CTS.

uditagarwal97 commented 1 month ago

FYI, while enabling E2E tests on PVC (https://github.com/intel/llvm/pull/14720), I see a failure in Matrix/SG32/element_wise_all_ops.cpp, which could be related to the PI2UR work. It is similar to the failure in Matrix/element_wise_all_ops.cpp (https://github.com/intel/llvm/issues/14795).

kbenzie commented 1 month ago

@KornevNikita https://github.com/intel/llvm/pull/14873 should fix the SYCL CTS. At the time of writing it has been confirmed to fix 2 of the 3 failures, but the run has not completed.

KornevNikita commented 1 month ago

@kbenzie thanks, I'll check it manually

KornevNikita commented 1 month ago

yep, that helps, thanks.