Open jonathan-ramsey opened 1 year ago
Update: I got it to work (finally!). I am posting my solution here in the hope it will help others.
1) I modified the `CMAKE_CXX_LINK_EXECUTABLE` variable in my CMakeLists.txt, inspired by what oneAPI/icx.exe does as well as this issue on the CMake gitlab: https://gitlab.kitware.com/cmake/cmake/-/issues/24243
if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
message(STATUS "Resetting linking command...")
set(CMAKE_CXX_LINK_EXECUTABLE "${_CMAKE_VS_LINK_EXE}<CMAKE_CXX_COMPILER> ${CMAKE_CL_NOLOGO}
<CMAKE_CXX_LINK_FLAGS> <OBJECTS> ${CMAKE_START_TEMP_FILE} /link <LINK_FLAGS> <LINK_LIBRARIES>
/out:<TARGET> /pdb:<TARGET_PDB> /version:<TARGET_VERSION_MAJOR>.<TARGET_VERSION_MINOR>
${_PLATFORM_LINK_FLAGS} ${CMAKE_END_TEMP_FILE}")
endif()
This doesn't actually seem to be a bug in the intel/llvm code, but rather that CMake detects the compiler as vanilla Clang rather than Clang w/SYCL. When using the oneAPI toolkit, CMake recognizes Intel/LLVM and does the right thing. I am running CMake 3.27.
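To check how CMake has classified a given compiler, a small diagnostic along these lines can be dropped into CMakeLists.txt after the `project()` call. This is only a sketch: the variables are standard CMake ones, but the status messages are illustrative, and the assumption that a self-built clang-cl reports `Clang` with an `MSVC` frontend variant is based on the behaviour described above rather than verified against every CMake version.

```cmake
# Diagnostic sketch: print how CMake classified the C++ compiler.
# Must run after project(), once compiler detection has happened.
message(STATUS "Compiler ID:        ${CMAKE_CXX_COMPILER_ID}")
message(STATUS "Frontend variant:   ${CMAKE_CXX_COMPILER_FRONTEND_VARIANT}")
message(STATUS "Simulated compiler: ${CMAKE_CXX_SIMULATE_ID}")

if (CMAKE_CXX_COMPILER_ID STREQUAL "IntelLLVM")
  # oneAPI icx/icpx: CMake's built-in handling is expected to apply.
  message(STATUS "Detected IntelLLVM; no link rule override should be needed.")
elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang"
        AND CMAKE_CXX_COMPILER_FRONTEND_VARIANT STREQUAL "MSVC")
  # Assumption: this is the case hit by a self-built intel/llvm clang-cl.
  message(STATUS "Detected clang-cl as plain Clang; the link rule override applies.")
endif()
```

If the first branch fires, the override in step 1 is probably unnecessary; if the second fires, the compiler is being treated as vanilla Clang.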
2) Whereas icx.exe tolerates mixing of compiler flags and linker flags (e.g. `/Qoption,link,/machine:x64 -fsycl`), the self-built intel/llvm Clang does NOT. Anything appearing after `/link` is passed to the linker. As such, I had to modify the `CMAKE_CXX_LINK_FLAGS` variable to ensure clang-cl.exe knew I wanted to link a SYCL program, rather than just passing the options to the linker (where they were ignored! See below). For example:
`set(CMAKE_CXX_LINK_FLAGS "-fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64-unknown-unknown")`
(An alternative would be `target_link_options`, but that doesn't put the options in the right place.)
Heads up: Using Ninja as the build generator for CMake generally causes warnings (but not errors) at link time to be suppressed, which was 80% of my problem. This appears to be a known issue: https://github.com/ninja-build/ninja/issues/1537
For example, using CMake+Ninja (`cmake --build . --clean-first --config Debug -v`) to build the `vector-add-buffers` sample from the oneAPI-samples repository prints out the following without errors or warnings during the link step:
> [2/2]"cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\vector-add-buffers.dir\Debug
--rc=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\mt.exe
--manifests -- C:\dev\sycl_workspace\llvm\build\bin\lld-link.exe /nologo CMakeFiles\vector-add-buffers.dir\Debug\src\vector-add-buffers.cpp.obj
/out:Debug\vector-add-buffers.exe /implib:Debug\vector-add-buffers.lib /pdb:Debug\vector-add-buffers.pdb /version:0.0
-g /machine:x64 /debug /INCREMENTAL /subsystem:console -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64-unknown-unknown
-LIBPATH:C:\dev\sycl_workspace\llvm\build\lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib
uuid.lib comdlg32.lib advapi32.lib && cd ."
If one runs the linking step directly, i.e.:
> C:\dev\sycl_workspace\llvm\build\bin\lld-link.exe /nologo CMakeFiles\vector-add-buffers.dir\Debug\src\vector-add-buffers.cpp.obj
/out:Debug\vector-add-buffers.exe /implib:Debug\vector-add-buffers.lib /pdb:Debug\vector-add-buffers.pdb /version:0.0
-g /machine:x64 /debug /INCREMENTAL /subsystem:console -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64-unknown-unknown
-LIBPATH:C:\dev\sycl_workspace\llvm\build\lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib
uuid.lib comdlg32.lib advapi32.lib
then the problem is much more apparent:
lld-link: warning: ignoring unknown argument '-g'
lld-link: warning: ignoring unknown argument '-fsycl'
lld-link: warning: ignoring unknown argument '-fsycl-targets=nvptx64-nvidia-cuda,spir64-unknown-unknown'
My workaround: Adding `/WX` to either `CMAKE_CXX_LINK_FLAGS` or `target_link_options` causes warnings about ignored options to be treated as errors.
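As a rough illustration of the `target_link_options` variant, the fragment below assumes a target named `vector-add-buffers` has already been created with `add_executable()` (the name is taken from the build log above). It is a sketch rather than the exact contents of the attached CMakeLists.txt, and the compiler ID guard is my own addition.

```cmake
# Fragment, not a complete CMakeLists.txt: assumes the target
# vector-add-buffers was defined earlier via add_executable().
if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
  # /WX asks the linker to treat warnings as errors, so an ignored
  # option such as -fsycl fails the build instead of being hidden
  # by Ninja's output handling.
  target_link_options(vector-add-buffers PRIVATE /WX)
endif()
```

With this in place, a link line that hands `-fsycl` directly to lld-link should fail the build, which makes the misplaced option visible even under Ninja.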
@jonathan-ramsey I'm glad this is working for you now. If I understand correctly, this is not really a bug in intel/llvm, but a few other issues (CMake, Ninja) stacked together. Did I understand correctly? If so, are you okay with closing the issue?
@maarquitos14: It is not a bug in intel/llvm itself. However, one might argue that it should be documented for the next person who tries to use intel/llvm on Windows with CUDA, so then perhaps a request for enhancement? 😄 Otherwise, I am okay with closing the issue.
@jonathan-ramsey thanks for the quick reply and sorry for the delay. I have been trying to reproduce your issue but I couldn't. Could you please summarize the commands to reproduce the issue? Thank you in advance.
@maarquitos14: Sorry for the delay in replying!
Included are copies of `CMakeLists.txt` and `src\vector-add-buffers.cpp` for which I can reproduce the error on my local machine. The `vector-add-buffers.cpp` file should be in a directory called `src` inside whichever directory the `CMakeLists.txt` resides in.
Building:
mkdir build
cd build
cmake -G"Ninja" .. --fresh -DCMAKE_CXX_COMPILER=clang-cl.exe
cmake --build . --clean-first --config Debug -v
set ONEAPI_DEVICE_SELECTOR=opencl:cpu
If everything built okay, then (in the `build` directory) execute `vector-add-buffers.exe`.
If I comment out line 8 of CMakeLists.txt, then I get the "No kernel named ... found -46 (PI_ERROR_INVALID_KERNEL_NAME)" error at runtime.
If I include line 8 of the CMakeLists.txt, and do a clean build, the test code runs successfully without error and gives the expected output.
Note that I was (and still am) using the "nightly-2023-09-28" build (47083f847f) of the `sycl` branch of the `intel/llvm` repository, and I have built the compiler to include NVIDIA CUDA support.
I am running Windows 10 Pro with Visual Studio Professional 2022.
Please let me know how it goes. Cheers.
When trying to build any SYCL-enabled project with CMake and the compiler built from the `sycl` branch of intel/llvm (see below), the successfully compiled binary kernel throws an error along the lines of "No kernel named (mangled name) was found -46 (PI_ERROR_INVALID_KERNEL_NAME)" while trying to execute any SYCL kernel.
I can get the error to arise even with the `vector-add-buffers` example in the oneAPI-samples repository.
However, if I use an installed copy of oneAPI Base toolkit (version 2023.2.1), then the example (and indeed other SYCL-enabled projects) work fine!
After digging around, the consensus is that I should be using the compiler (`clang-cl.exe`) to link, while the default build process uses `lld-link.exe`. Okay, so I did that and used `clang-cl.exe` to link with the appropriate options instead, but it does not fix the issue.
After carefully comparing the build process for the oneAPI toolkit versus the self-built clang/llvm from this repo, it would seem that the linking step of the oneAPI toolkit build (which is using `icx.exe`) is doing many more things than when I link using the self-built clang-cl.exe or lld-link.exe (e.g. `icx.exe` makes multiple calls to `clang-offload-builder`).
Another big indicator that things are different is that the oneAPI toolkit built executable is almost 4 times larger than the executable from the self-built clang compiler.
Important note: In the case of the `vector-add-buffers` oneAPI sample, if I forego CMake and compile just the single source code file in a one-liner (e.g. `clang-cl -fsycl /EHsc vector-add-buffers.cpp -o vector-add-buffers.exe`), then it does work as expected.
Can anyone tell me what additional steps I need to take to get the linking step of the self-built clang to behave like the oneAPI toolkit?
An alternative way to phrase this might be: how do I deploy the self-built clang/DPC++ toolchain on my local system?
Environment:
`set ONEAPI_DEVICE_SELECTOR=opencl:cpu` to force calculation on the CPU.
CMakeLists.txt
P.S. If you're wondering why I'd want to use SYCL to target CPUs, it is only a stepping stone to NVIDIA GPU offloading...once I can get things working...
P.P.S. I have tried other commits, including a nightly from late last week, and one from the end of June matching the last release date of the oneAPI toolkit, but the problem persists.
Thank you in advance for any help or suggestions you can give!
My slightly modified `vector-add-buffers.cpp` and corresponding `CMakeLists.txt` are attached. vector-add-buffers.txt CMakeLists.txt