intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Other
1.26k stars 743 forks source link

sycl spec make_device not working #5769

Open ye-luo opened 2 years ago

ye-luo commented 2 years ago

Describe the bug auto device = sycl::make_device<sycl::backend::ext_oneapi_level_zero>((_ze_device_handle_t*)hDevice); fails at run.

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE

I have to do non-portable.

const sycl::platform sycl_platform=sycl::ext::oneapi::level_zero::make_platform(reinterpret_cast<pi_native_handle>(hPlatform));
auto device = sycl::ext::oneapi::level_zero::make_device(sycl_platform, reinterpret_cast<pi_native_handle>(hDevice));

it does work.

both above need me to include

#include <level_zero/ze_api.h>

This adds another level of complexity since this file is not shipped by the compiler but level-zero-dev package.

Another one I tried is

auto device = sycl::detail::make_device(reinterpret_cast<pi_native_handle>(hDevice), sycl::backend::ext_oneapi_level_zero);

it doesn't need ze_api.h header file. However, I got the same -30 error.

Does level-zero have spec compliant interoperability API implementation? Or I have to rely on the extension?

ye-luo commented 2 years ago

A bit more background, I was testing interop with OpenMP from icpx

                auto hPlatform = omp_get_interop_ptr(o, omp_ipr_platform, &err);
                auto hContext = omp_get_interop_ptr(o, omp_ipr_device_context, &err);
                auto hDevice =  omp_get_interop_ptr(o, omp_ipr_device, &err);
TApplencourt commented 2 years ago

Full reproducer:

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
  const auto hDevice =  static_cast<ze_device_handle_t>(omp_get_interop_ptr(o, omp_ipr_device, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_device)");
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}

Compile

icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp

Run

./a.out
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE)
Aborted
alexbatashev commented 2 years ago

@TApplencourt are you sure omp_get_interop_ptr returns ze_device_handle_t? We have a similarly simple test and it passes: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/Plugin/interop-level-zero.cpp

TApplencourt commented 2 years ago

I think so. I found a workaround, this may help you diagnose what is going on internally. Maybe some L0 objects are not initialized when calling only make_device? (Note that I don't use the sycl::platform explicitly when creating the sycl::device)

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
#ifdef _WA
  const ze_driver_handle_t hPlatform = static_cast<ze_driver_handle_t>(omp_get_interop_ptr(o, omp_ipr_platform, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_platform)");
#endif
  const auto hDevice =  static_cast<ze_device_handle_t>(omp_get_interop_ptr(o, omp_ipr_device, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_device)");
  #pragma omp interop destroy(o)
#ifdef _WA
  const sycl::platform sycl_platform = sycl::make_platform<sycl::backend::ext_oneapi_level_zero>(hPlatform);
#endif
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}
$ icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp -D_WA
$ ./a.out
$ icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp
$ ./a.out
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE)
Aborted
ye-luo commented 2 years ago

@TApplencourt are you sure omp_get_interop_ptr returns ze_device_handle_t? We have a similarly simple test and it passes: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/Plugin/interop-level-zero.cpp

Your example is sycl -> L0 -> sycl. There are likely implicit things in SYCL making it pass. We need real interop to work between OpenMP and SYCL.

TApplencourt commented 2 years ago

Correct what we want in a perfect world is:

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
  const auto hPlatform = static_cast<pi_native_object>(omp_get_interop_ptr(o, omp_ipr_platform, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_platform)");
   #pragma omp interop destroy(o)
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}
ye-luo commented 2 years ago

key requirement is to avoid

  1. any explicit L0 types like ze_device_handle_t. This should be hidden by pi_native_object and enums.
  2. any non standard APIs.
keryell commented 2 years ago

@TApplencourt it would be nice to have such a SYCL & OpenMP interop example for the SYCL presentation! :-)

TApplencourt commented 2 years ago

To please @keryell I did more tests, who seem to have discovered a few new bugs. @alexbatashev should I open new tickets for those? I can also prepare a longer write-up if useful.

The code lives here: https://github.com/argonne-lcf/HPC-Patterns/blob/main/sycl_omp_ze_interopt/interop_omp_ze_sycl.cpp And tests OpenMP <-> L0 <-> SYCL Interopt.

The short story is that using the now deprecated sycl::level_zer0::make<sycl::device> and friend the code work.

icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp
./a.out
OMP -> SYCL
   SYCL memcopy using OpenMP pointer
SYCL -> OMP
  OMP memcopy using SYCL pointer
Computation Done

When using the new free function sycl::make_device<sycl::backend::ext_oneapi_level_zero> and friend the code doesn't work.

  1. When changing

    const sycl::device sycl_device = sycl::level_zero::make<sycl::device>(sycl_platform, hDevice);

    to

    const sycl::device sycl_device = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);

    Trigger a ( icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp -DMAKE_DEVICE)

    terminate called after throwing an instance of 'cl::sycl::invalid_parameter_error'
    what():  Queue cannot be constructed with the given context and device as the context does not contain the given device. -33 (CL_INVALID_DEVICE)
    Aborted

    The new function doesn't take a platform argument but not sure it that matter

  2. Whene Changing

    const sycl::context sycl_context = sycl::ext::oneapi::level_zero::make<sycl::context>(sycl_devices, hContext,  sycl::ext::oneapi::level_zero::ownership::keep);

    to

    sycl::backend_input_t<sycl::backend::ext_oneapi_level_zero, sycl::context> hContextInteropInput = {hContext, sycl_devices};
    const sycl::context sycl_context = sycl::make_context<sycl::backend::ext_oneapi_level_zero>(hContextInteropInput);

    Make the code segfault (icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp -DMAKE_CONTEXT) . Look like make<sycl::context> doesn't have the KeepOwnership option anymore, maybe it's the problem.

alexbatashev commented 2 years ago

@TApplencourt it's fine, let's use this tracker.

+ @smaslov-intel, do you have any idea why these code samples do not work?

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

TApplencourt commented 2 years ago

The two problems are still here (sycl::make_device trigger a cl::sycl::invalid_parameter_error, and sycl::make_context trigger a segfault) with Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.20220629)

A newer compiler added a direct OpenMP <-> Sycl interopt (aka Sycl can read direcly the OpenMP object), so this bug is less important for our particular use case. But for portability reason, I think this bug still matter a little bit :)

sogartar commented 2 years ago

I also get the same error. Here is a test that reproduces the error https://github.com/sogartar/make_sycl_device_from_level_zero_device_test/commit/2ee5e501e172bf7d5a6d02d3dba958ac7cb1beee

KornevNikita commented 6 months ago

Hi! There have been no updates for at least the last 60 days, though the ticket has assignee(s).

@AlexeySachkov, could I ask you to take one of the following actions? :)

Thanks!

github-actions[bot] commented 4 months ago

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

Thanks!

github-actions[bot] commented 2 months ago

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

Thanks!

github-actions[bot] commented 2 weeks ago

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

Thanks!