intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

[SYCL][Flaky] SYCL :: USM/copy2d.cpp failed on Level Zero Windows on unrelated changes #8126

Open dm-vodopyanov opened 1 year ago

dm-vodopyanov commented 1 year ago

The failure occurred in PR https://github.com/intel/llvm/pull/8056. The same intel/llvm PR was also checked against the same intel/llvm-test-suite PR on the llvm-test-suite side, and there the test passed.

******************** TEST 'SYCL :: USM/copy2d.cpp' FAILED ********************
Script:
--
: 'RUN: at line 9';    Test_Suite/llvm.obj/install/bin/clang-cl.exe /EHsc     -fsycl -fsycl-device-code-split=per_kernel -fsycl-targets=spir64  Test_Suite\llvm-test-suite\SYCL\USM\copy2d.cpp -o Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out
: 'RUN: at line 10';   true Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out
: 'RUN: at line 11';    env ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:gpu  Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out
: 'RUN: at line 12';   true Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out
--
Exit Code: 1

Command Output (stdout):
--
$ ":" "RUN: at line 9"
note: command had no output on stdout or stderr
$ "Test_Suite/llvm.obj/install/bin/clang-cl.exe" "/EHsc" "-fsycl" "-fsycl-device-code-split=per_kernel" "-fsycl-targets=spir64" "Test_Suite\llvm-test-suite\SYCL\USM\copy2d.cpp" "-o" "Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out"
note: command had no output on stdout or stderr
$ ":" "RUN: at line 10"
note: command had no output on stdout or stderr
$ "true" "Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out"
note: command had no output on stdout or stderr
$ ":" "RUN: at line 11"
note: command had no output on stdout or stderr
$ "env" "ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:gpu" "Test_Suite\build\SYCL\USM\Output\copy2d.cpp.tmp.out"
# command output:
Test 7 (device USM allocation, shortcut with no dependency events)
Value at 2464 did not match the expected value; 1024 != 1234

error: command failed with exit status: 1

aelovikov-intel commented 1 month ago

I think the relevant PR that disabled the tests is https://github.com/intel/llvm-test-suite/pull/1689, even though it doesn't mention this issue. Tests under https://github.com/intel/llvm/tree/sycl/sycl/test-e2e/USM/memops2d are still disabled on Windows.

aelovikov-intel commented 4 weeks ago

The issue still seems to be present: https://github.com/intel/llvm/actions/runs/9454117594/job/26146692998?pr=14128.

aelovikov-intel commented 3 weeks ago

Reduced test case; not sure whether other USM combinations (src/dst) are affected:

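#include <algorithm> // std::fill
#include <cstdlib>   // malloc, free
#include <iostream>  // std::cout
#include <sycl/sycl.hpp>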

using namespace sycl;

constexpr size_t RECT_WIDTH = 30;
constexpr size_t RECT_HEIGHT = 21;

int main() {
  queue Q;
  for (int iter = 0; iter < 1000; ++iter) {
    int ExpectedVal = 42;

    constexpr size_t SRC_ELEMS = RECT_WIDTH * RECT_HEIGHT;
    constexpr size_t DST_ELEMS = SRC_ELEMS;

    auto *USMMemSrc = malloc_shared<int>(SRC_ELEMS, Q);
    auto *USMMemDst = (int *)malloc(DST_ELEMS * sizeof(int));

    Q.fill(USMMemSrc, ExpectedVal, SRC_ELEMS).wait();
    std::fill(USMMemDst, USMMemDst + DST_ELEMS, 0);
    Q.ext_oneapi_copy2d(USMMemSrc, RECT_WIDTH, USMMemDst, RECT_WIDTH,
                        RECT_WIDTH, RECT_HEIGHT)
        .wait();

    for (size_t I = 0; I < DST_ELEMS; ++I) {
      if (USMMemDst[I] != ExpectedVal) {
        std::cout << "Failed at iteration " << iter << std::endl;
        std::cout << "Value at " << I << " did not match the expected value; "
                  << USMMemDst[I] << " != " << ExpectedVal << std::endl;
        free(USMMemSrc, Q);
        free(USMMemDst);
        return 1;
      }
    }

    free(USMMemSrc, Q);
    free(USMMemDst);
  }

  return 0;
}

bash-3.2$ clang++ -fsycl a.cpp -o a.exe -O3
bash-3.2$ ( export A="" ; while ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./a.exe ; do export A="f$A" ; done ; echo $A | wc )
Failed at iteration 814
Value at 132 did not match the expected value; 0 != 42
      1       1      14
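
To see where in the rectangle the first mismatch falls, the flat index from the log can be split into a row and a column. The helper below is only an illustrative sketch: `RowCol` and `rowColOf` are hypothetical names, not part of the test or of SYCL, and it assumes the dense layout used by the reproducer (pitch equal to `RECT_WIDTH`).

```cpp
#include <cstddef>

// Hypothetical helper type: position of an element inside the rectangle.
struct RowCol {
  std::size_t row;
  std::size_t col;
};

// Split a flat element index into (row, col), assuming rows are stored
// back to back with no padding (pitch == width), as in the reproducer.
constexpr RowCol rowColOf(std::size_t index, std::size_t width) {
  return RowCol{index / width, index % width};
}

// Values from the log above: first mismatch at index 132, RECT_WIDTH == 30.
static_assert(rowColOf(132, 30).row == 4, "index 132 is in row 4");
static_assert(rowColOf(132, 30).col == 12, "index 132 is at column 12");
```

Under these assumptions, index 132 is column 12 of row 4, so the first stale element is in the middle of a row rather than at its start.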

aelovikov-intel commented 3 weeks ago

2D operations aren't necessary. The USM->system memory copy goes through the fallback path, so the same failure can be reproduced like this:

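#include <algorithm> // std::fill
#include <chrono>
#include <cstdlib>   // std::atoi, malloc, free
#include <iostream>  // std::cout
#include <thread>    // std::this_thread::sleep_for
#include <vector>
#include <sycl/sycl.hpp>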

using namespace sycl;

constexpr size_t RECT_WIDTH = 30;

// Can't reproduce with <=9, original test had 21.
constexpr size_t RECT_HEIGHT = 10;

int main(int argc, char *argv[]) {
  int N = argc == 1 ? 1 : std::atoi(argv[1]);
  queue Q;
  for (int iter = 0; iter < N; ++iter) {
    int ExpectedVal = 42;

    constexpr size_t SRC_ELEMS = RECT_WIDTH * RECT_HEIGHT;
    constexpr size_t DST_ELEMS = SRC_ELEMS;

    auto *USMMemSrc = malloc_device<int>(SRC_ELEMS, Q);
    auto *USMMemDst = (int *)malloc(DST_ELEMS * sizeof(int));

    Q.fill(USMMemSrc, ExpectedVal, SRC_ELEMS).wait();
    std::fill(USMMemDst, USMMemDst + DST_ELEMS, 0);

    std::vector<event> events;
    for (int row = 0; row < RECT_HEIGHT; ++row) {
      events.push_back(Q.copy(USMMemSrc + row * RECT_WIDTH,
                              USMMemDst + row * RECT_WIDTH, RECT_WIDTH));
    }

    sycl::event::wait(events);

    for (size_t I = 0; I < DST_ELEMS; ++I) {
      if (USMMemDst[I] != ExpectedVal) {
        std::cout << "Failed at iteration " << iter << std::endl;
        std::cout << "Value at " << I << " did not match the expected value; "
                  << USMMemDst[I] << " != " << ExpectedVal << std::endl;
        using namespace std::chrono_literals;
        std::this_thread::sleep_for(1s);
        if (USMMemDst[I] != ExpectedVal) {
          std::cout << "Still different after sleep: " << USMMemDst[I]
                    << std::endl;
        } else {
          std::cout << "As expected after sleep" << std::endl;
        }
        std::cout << "Src: " << USMMemSrc << std::endl;
        std::cout << "Dst: " << USMMemDst << std::endl;
        free(USMMemSrc, Q);
        free(USMMemDst);
        return 1;
      }
    }

    free(USMMemSrc, Q);
    free(USMMemDst);
  }

  return 0;
}

bash-3.2$ clang++ -fsycl a.cpp -o a.exe -O0
bash-3.2$ ( export A="" ; while ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./a.exe 10000 ; do export A="f$A" ; done ; echo $A | wc )
Failed at iteration 2589
Value at 84 did not match the expected value; 0 != 42
Still different after sleep: 0
Src: FFFFB80200010000
Dst: 000001E13D299EB0
      1       1       3

Cannot reproduce the issue if tracing is enabled.
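
As a side calculation on the log above, the host address of the first mismatching element can be derived from the printed `Dst` pointer and the failing index. This is only an arithmetic sketch: `elementAddress` and `isPageAligned` are hypothetical helper names, and the 4-byte `int` and 4 KiB page size are assumptions about the machine that produced the log.

```cpp
#include <cstddef>
#include <cstdint>

// Address of element `index` in an array that starts at `base`.
constexpr std::uint64_t elementAddress(std::uint64_t base, std::size_t index,
                                       std::size_t elemSize) {
  return base + static_cast<std::uint64_t>(index) * elemSize;
}

// True if `addr` is a multiple of `pageSize`.
// ASSUMPTION: 4096-byte pages; the log does not state the page size.
constexpr bool isPageAligned(std::uint64_t addr,
                             std::uint64_t pageSize = 4096) {
  return addr % pageSize == 0;
}

// Values taken from the log: Dst = 000001E13D299EB0, first mismatch at
// index 84. ASSUMPTION: sizeof(int) == 4.
constexpr std::uint64_t kDst = 0x000001E13D299EB0ULL;
constexpr std::uint64_t kFirstBad = elementAddress(kDst, 84, 4);

static_assert(kFirstBad == 0x000001E13D29A000ULL,
              "Dst + 84 * 4 bytes lands on 0x...29A000");
static_assert(isPageAligned(kFirstBad),
              "that address is a multiple of 4096");
```

Under these assumptions, the first stale element in this particular run sits exactly at the start of a 4 KiB page of the `malloc`'ed destination buffer. A single run is not enough to tell whether that is meaningful or a coincidence, but it may be worth checking in other failing runs.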

aelovikov-intel commented 3 weeks ago

Transferred internal bug report to L0 folks.