Open jonasdelacour opened 2 weeks ago
Yes, oneDPL has a predefined execution policy (dpcpp_default) that creates a SYCL queue at construction.
As a workaround, if that predefined policy is not used in the program, you can define the ONEDPL_USE_PREDEFINED_POLICIES macro to zero before including any oneDPL header. Some details here: https://oneapi-src.github.io/oneDPL/macros.html#additional-macros
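For anyone trying this, here is a rough sketch of where the macro would go. It is not the reporter's reproducer, and I have not confirmed that it removes the CUDA error on the setup described in this issue. The helper name fork_and_wait is made up for illustration, and the __has_include guard is only there so the snippet also builds on a machine without oneDPL.

```cpp
// Sketch only: the macro has to be defined before the first oneDPL include,
// so that the predefined policy objects (e.g. dpcpp_default) are not available.
#define ONEDPL_USE_PREDEFINED_POLICIES 0

// Guarded so the sketch still compiles where oneDPL is not installed.
#if __has_include(<oneapi/dpl/execution>)
#include <oneapi/dpl/execution>
#endif

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not part of oneDPL or SYCL): forks, lets the child
// exit normally, and returns the child's exit code, or -1 on failure.
int fork_and_wait() {
    std::fflush(nullptr);  // avoid duplicating buffered output in the child
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: a normal exit runs static destructors, which is the point
        // at which the queue destruction was reported to fail.
        std::exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the predefined policies disabled, any device policy the program needs would have to be built explicitly from a sycl::queue that is created after the fork() call.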
Including any <oneapi/dpl> header seems to construct a sycl::queue, which must not be done prior to fork() calls. If you do, you get CUDA_ERROR_NOT_INITIALIZED when the child process attempts to destroy this queue and the underlying CUDA context.
Here's a minimal example to reproduce this error:
Compiled with icpx -fsycl, it produces the following stack trace from valgrind:
icpx version:
Intel(R) oneAPI DPC++/C++ Compiler 2024.1.2 (2024.1.2.20240508)
Codeplay plugin for NVIDIA GPUs version: oneapi-for-nvidia-gpus-2024.1.2-cuda-12.0
nvidia-smi output:
OS:
Ubuntu 22.04.4 LTS