intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

Linking issue with NVIDIA and AMDGPU backends #13815

Open Aympab opened 4 months ago

Aympab commented 4 months ago

Describe the bug

Hello,

I am using DPC++ to build a SYCL 2020 project inside a Docker container with the OpenCL, NVIDIA, and AMD GPU backends. I used to build DPC++ at commit 589824d, and that works fine: the code builds with -fsycl-targets=amd_gpu_gfx90a or -fsycl-targets=nvidia_gpu_sm_80 for the respective GPUs. I pass these flags for both compilation and linking through a CMake integration.

I recently tried to update the container to the latest release, and when I run the same commands I get the following errors:

The error for the NVIDIA GPU: clang++: warning: linked binaries do not contain expected 'nvptx64-nvidia-cuda-sm_80-' target; found targets: 'nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-, nvptx64-nvidia-cuda--sm_80-' [-Wsycl-target]

The error for the AMD GPU: clang++: warning: linked binaries do not contain expected 'amdgcn-amd-amdhsa-gfx90a-' target; found targets: 'amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-' [-Wsycl-target]

The errors occur at link time. I tried other flags, such as -fsycl-targets=nvptx64-nvidia-cuda, but then I get different errors.

For the NVIDIA GPU, the warning is non-blocking and clang++ generates a binary, but that binary crashes at the first parallel_for invocation with error 46 (PI_ERROR_INVALID_KERNEL_NAME): the mangled name of the kernel is not found.

For the AMD GPU, the warning is followed by an error and no binary is generated: clang++: error: amdgcn-link command failed with exit code 254. Here is the full dump:

clang++: warning: linked binaries do not contain expected 'amdgcn-amd-amdhsa-gfx90a-' target; found targets: 'amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-' [-Wsycl-target]
lld: /opt/sycl/source/llvm/llvm/lib/CodeGen/MachineFrameInfo.cpp:55: int llvm::MachineFrameInfo::CreateStackObject(uint64_t, llvm::Align, bool, const llvm::AllocaInst*, uint8_t): Assertion `Size != 0 && "Cannot allocate zero size stack objects!"' failed.
PLEASE submit a bug report to https://github.com/intel/llvm/issues and include the crash backtrace.
Stack dump:
0.  Program arguments: /opt/sycl/dpcpp/bin/lld -flavor gnu -m elf64_amdgpu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx90a -plugin-opt=O3 --lto-CGO3 -o /tmp/main-gfx90a-868a2d-15f874.out /tmp/main-gfx90a-89311f-2a560a.o
1.  Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
2.  Running pass 'Greedy Register Allocator' on function '@_ZTSZZN4AdvX15StraddledMalloc8adv_opt3ERN4sycl3_V15queueERNS2_6bufferIdLi3ENS2_6detail17aligned_allocatorIdEEvEERK9ADVParamsRKmENKUlRNS2_7handlerEE_clESH_EUlNS2_5groupILi3EEEE__with_offset'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lld             0x000055ac3a04fe6f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 63
1  lld             0x000055ac3a04d5c4
2  libpthread.so.0 0x000014e0cee8c420
3  libc.so.6       0x000014e0ce92900b gsignal + 203
4  libc.so.6       0x000014e0ce908859 abort + 299
5  libc.so.6       0x000014e0ce908729
6  libc.so.6       0x000014e0ce919fd6
7  lld             0x000055ac3b68b2cc llvm::MachineFrameInfo::CreateSpillStackObject(unsigned long, llvm::Align) + 1436
8  lld             0x000055ac3b909282 llvm::VirtRegMap::createSpillSlot(llvm::TargetRegisterClass const*) + 258
9  lld             0x000055ac3b909694 llvm::VirtRegMap::assignVirt2StackSlot(llvm::Register) + 132
10 lld             0x000055ac3ba16b5d
11 lld             0x000055ac3b800000
12 lld             0x000055ac3b80078e
13 lld             0x000055ac3bae1325 llvm::RegAllocBase::allocatePhysRegs() + 325
14 lld             0x000055ac3b7fa7f8
15 lld             0x000055ac3b69a97e
16 lld             0x000055ac3d16d7f1 llvm::FPPassManager::runOnFunction(llvm::Function&) + 865
17 lld             0x000055ac3c822a77
18 lld             0x000055ac3d16e2b2 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 898
19 lld             0x000055ac3b3c0105
20 lld             0x000055ac3b3c06ad llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) + 701
21 lld             0x000055ac3b3b3c6d llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) + 2957
22 lld             0x000055ac3b3b431d llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) + 781
23 lld             0x000055ac3a21eed8 lld::elf::BitcodeCompiler::compile() + 392
24 lld             0x000055ac3a18ae0f void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true>>(bool) + 207
25 lld             0x000055ac3a1a44bf lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) + 7071
26 lld             0x000055ac3a1a6d06 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) + 4550
27 lld             0x000055ac3a1a854a lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) + 4714
28 lld             0x000055ac39fbfd11 lld_main(int, char**, llvm::ToolContext const&) + 417
29 lld             0x000055ac39f1cdf5 main + 53
30 libc.so.6       0x000014e0ce90a083 __libc_start_main + 243
31 lld             0x000055ac39fbd07e _start + 46
llvm-foreach: Aborted
lld: /opt/sycl/source/llvm/llvm/lib/CodeGen/MachineFrameInfo.cpp:55: int llvm::MachineFrameInfo::CreateStackObject(uint64_t, llvm::Align, bool, const llvm::AllocaInst*, uint8_t): Assertion `Size != 0 && "Cannot allocate zero size stack objects!"' failed.
PLEASE submit a bug report to https://github.com/intel/llvm/issues and include the crash backtrace.
Stack dump:
0.  Program arguments: /opt/sycl/dpcpp/bin/lld -flavor gnu -m elf64_amdgpu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx90a -plugin-opt=O3 --lto-CGO3 -o /tmp/main-gfx90a-868a2d-67b1c7.out /tmp/main-gfx90a-89311f-a6d4f0.o
1.  Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
2.  Running pass 'Greedy Register Allocator' on function '@_ZTSZZN4AdvX14ReverseIndexesclERN4sycl3_V15queueERNS2_6bufferIdLi3ENS2_6detail17aligned_allocatorIdEEvEERK9ADVParamsENKUlRNS2_7handlerEE_clESF_EUlNS2_5groupILi3EEEE__with_offset'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lld             0x000055a2f431ee6f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 63
1  lld             0x000055a2f431c5c4
2  libpthread.so.0 0x00001467eb6af420
3  libc.so.6       0x00001467eb14c00b gsignal + 203
4  libc.so.6       0x00001467eb12b859 abort + 299
5  libc.so.6       0x00001467eb12b729
6  libc.so.6       0x00001467eb13cfd6
7  lld             0x000055a2f595a2cc llvm::MachineFrameInfo::CreateSpillStackObject(unsigned long, llvm::Align) + 1436
8  lld             0x000055a2f5bd8282 llvm::VirtRegMap::createSpillSlot(llvm::TargetRegisterClass const*) + 258
9  lld             0x000055a2f5bd8694 llvm::VirtRegMap::assignVirt2StackSlot(llvm::Register) + 132
10 lld             0x000055a2f5ce5b5d
11 lld             0x000055a2f5acf000
12 lld             0x000055a2f5acf78e
13 lld             0x000055a2f5db0325 llvm::RegAllocBase::allocatePhysRegs() + 325
14 lld             0x000055a2f5ac97f8
15 lld             0x000055a2f596997e
16 lld             0x000055a2f743c7f1 llvm::FPPassManager::runOnFunction(llvm::Function&) + 865
17 lld             0x000055a2f6af1a77
18 lld             0x000055a2f743d2b2 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 898
19 lld             0x000055a2f568f105
20 lld             0x000055a2f568f6ad llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) + 701
21 lld             0x000055a2f5682c6d llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) + 2957
22 lld             0x000055a2f568331d llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) + 781
23 lld             0x000055a2f44eded8 lld::elf::BitcodeCompiler::compile() + 392
24 lld             0x000055a2f4459e0f void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true>>(bool) + 207
25 lld             0x000055a2f44734bf lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) + 7071
26 lld             0x000055a2f4475d06 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) + 4550
27 lld             0x000055a2f447754a lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) + 4714
28 lld             0x000055a2f428ed11 lld_main(int, char**, llvm::ToolContext const&) + 417
29 lld             0x000055a2f41ebdf5 main + 53
30 libc.so.6       0x00001467eb12d083 __libc_start_main + 243
31 lld             0x000055a2f428c07e _start + 46
llvm-foreach: Aborted
lld: /opt/sycl/source/llvm/llvm/lib/CodeGen/MachineFrameInfo.cpp:55: int llvm::MachineFrameInfo::CreateStackObject(uint64_t, llvm::Align, bool, const llvm::AllocaInst*, uint8_t): Assertion `Size != 0 && "Cannot allocate zero size stack objects!"' failed.
PLEASE submit a bug report to https://github.com/intel/llvm/issues and include the crash backtrace.
Stack dump:
0.  Program arguments: /opt/sycl/dpcpp/bin/lld -flavor gnu -m elf64_amdgpu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx90a -plugin-opt=O3 --lto-CGO3 -o /tmp/main-gfx90a-868a2d-c72b79.out /tmp/main-gfx90a-89311f-a72631.o
1.  Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
2.  Running pass 'Greedy Register Allocator' on function '@_ZTSZZN4AdvX16ReducedPrecisionclERN4sycl3_V15queueERNS2_6bufferIdLi3ENS2_6detail17aligned_allocatorIdEEvEERK9ADVParamsENKUlRNS2_7handlerEE_clESF_EUlNS2_5groupILi3EEEE__with_offset'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lld             0x000055792a65ee6f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 63
1  lld             0x000055792a65c5c4
2  libpthread.so.0 0x0000149cd527a420
3  libc.so.6       0x0000149cd4d1700b gsignal + 203
4  libc.so.6       0x0000149cd4cf6859 abort + 299
5  libc.so.6       0x0000149cd4cf6729
6  libc.so.6       0x0000149cd4d07fd6
7  lld             0x000055792bc9a2cc llvm::MachineFrameInfo::CreateSpillStackObject(unsigned long, llvm::Align) + 1436
8  lld             0x000055792bf18282 llvm::VirtRegMap::createSpillSlot(llvm::TargetRegisterClass const*) + 258
9  lld             0x000055792bf18694 llvm::VirtRegMap::assignVirt2StackSlot(llvm::Register) + 132
10 lld             0x000055792c025b5d
11 lld             0x000055792be0f000
12 lld             0x000055792be0f78e
13 lld             0x000055792c0f0325 llvm::RegAllocBase::allocatePhysRegs() + 325
14 lld             0x000055792be097f8
15 lld             0x000055792bca997e
16 lld             0x000055792d77c7f1 llvm::FPPassManager::runOnFunction(llvm::Function&) + 865
17 lld             0x000055792ce31a77
18 lld             0x000055792d77d2b2 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 898
19 lld             0x000055792b9cf105
20 lld             0x000055792b9cf6ad llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) + 701
21 lld             0x000055792b9c2c6d llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) + 2957
22 lld             0x000055792b9c331d llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) + 781
23 lld             0x000055792a82ded8 lld::elf::BitcodeCompiler::compile() + 392
24 lld             0x000055792a799e0f void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true>>(bool) + 207
25 lld             0x000055792a7b34bf lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) + 7071
26 lld             0x000055792a7b5d06 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) + 4550
27 lld             0x000055792a7b754a lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) + 4714
28 lld             0x000055792a5ced11 lld_main(int, char**, llvm::ToolContext const&) + 417
29 lld             0x000055792a52bdf5 main + 53
30 libc.so.6       0x0000149cd4cf8083 __libc_start_main + 243
31 lld             0x000055792a5cc07e _start + 46
llvm-foreach: Aborted
lld: /opt/sycl/source/llvm/llvm/lib/CodeGen/MachineFrameInfo.cpp:55: int llvm::MachineFrameInfo::CreateStackObject(uint64_t, llvm::Align, bool, const llvm::AllocaInst*, uint8_t): Assertion `Size != 0 && "Cannot allocate zero size stack objects!"' failed.
PLEASE submit a bug report to https://github.com/intel/llvm/issues and include the crash backtrace.
Stack dump:
0.  Program arguments: /opt/sycl/dpcpp/bin/lld -flavor gnu -m elf64_amdgpu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx90a -plugin-opt=O3 --lto-CGO3 -o /tmp/main-gfx90a-868a2d-85eb67.out /tmp/main-gfx90a-89311f-c2f3bc.o
1.  Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
2.  Running pass 'Greedy Register Allocator' on function '@_ZTSZZN4AdvX12HierarchicalclERN4sycl3_V15queueERNS2_6bufferIdLi3ENS2_6detail17aligned_allocatorIdEEvEERK9ADVParamsENKUlRNS2_7handlerEE_clESF_EUlNS2_5groupILi3EEEE__with_offset'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lld             0x00005591b6111e6f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 63
1  lld             0x00005591b610f5c4
2  libpthread.so.0 0x00001472e776f420
3  libc.so.6       0x00001472e720c00b gsignal + 203
4  libc.so.6       0x00001472e71eb859 abort + 299
5  libc.so.6       0x00001472e71eb729
6  libc.so.6       0x00001472e71fcfd6
7  lld             0x00005591b774d2cc llvm::MachineFrameInfo::CreateSpillStackObject(unsigned long, llvm::Align) + 1436
8  lld             0x00005591b79cb282 llvm::VirtRegMap::createSpillSlot(llvm::TargetRegisterClass const*) + 258
9  lld             0x00005591b79cb694 llvm::VirtRegMap::assignVirt2StackSlot(llvm::Register) + 132
10 lld             0x00005591b7ad8b5d
11 lld             0x00005591b78c2000
12 lld             0x00005591b78c278e
13 lld             0x00005591b7ba3325 llvm::RegAllocBase::allocatePhysRegs() + 325
14 lld             0x00005591b78bc7f8
15 lld             0x00005591b775c97e
16 lld             0x00005591b922f7f1 llvm::FPPassManager::runOnFunction(llvm::Function&) + 865
17 lld             0x00005591b88e4a77
18 lld             0x00005591b92302b2 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 898
19 lld             0x00005591b7482105
20 lld             0x00005591b74826ad llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) + 701
21 lld             0x00005591b7475c6d llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) + 2957
22 lld             0x00005591b747631d llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) + 781
23 lld             0x00005591b62e0ed8 lld::elf::BitcodeCompiler::compile() + 392
24 lld             0x00005591b624ce0f void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true>>(bool) + 207
25 lld             0x00005591b62664bf lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) + 7071
26 lld             0x00005591b6268d06 lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) + 4550
27 lld             0x00005591b626a54a lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) + 4714
28 lld             0x00005591b6081d11 lld_main(int, char**, llvm::ToolContext const&) + 417
29 lld             0x00005591b5fdedf5 main + 53
30 libc.so.6       0x00001472e71ed083 __libc_start_main + 243
31 lld             0x00005591b607f07e _start + 46
llvm-foreach: Aborted
clang++: error: amdgcn-link command failed with exit code 254 (use -v to see invocation)
clang version 17.0.0 (https://github.com/intel/llvm.git 663042b04b63cb1830ae12b4442e12af4830764e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/sycl/dpcpp/bin
clang++: warning: linked binaries do not contain expected 'amdgcn-amd-amdhsa-gfx90a-' target; found targets: 'amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-, amdgcn-amd-amdhsa--gfx90a-' [-Wsycl-target]
clang++: note: diagnostic msg: Error generating preprocessed source(s).
make[2]: *** [src/CMakeFiles/advection.dir/build.make:278: src/advection] Error 1
make[1]: *** [CMakeFiles/Makefile2:153: src/CMakeFiles/advection.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

To reproduce

    #include <sycl/sycl.hpp>

    int main() {
        sycl::range r(64, 64, 64);
        sycl::buffer<double, 3> buff(r);

        sycl::queue q;
        // At least one kernel invocation is needed so that device code is compiled for the target
        q.submit([&](sycl::handler &cgh) {
            sycl::accessor acc(buff, cgh, sycl::write_only, sycl::no_init);
            cgh.parallel_for(r, [=](sycl::id<3> itm) {
                acc[itm] = itm[0] + itm[1] + itm[2];
            });
        }).wait();

        return 0;
    }


- Compile command:
`clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 -c test.cpp -o test.o`
- Linking command:
`clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 test.o -o a.out`

- The warning:
`clang++: warning: linked binaries do not contain expected 'nvptx64-nvidia-cuda-sm_80-' target; found targets: 'nvptx64-nvidia-cuda--sm_80-' [-Wsycl-target]`. The expected and found targets differ only by a single extra `-`.
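
To make the one-character difference in that warning easier to see, here is a minimal sketch in plain C++ (no SYCL needed) that splits the two target strings on `-` and keeps empty fields. The helper names `split_fields` and `same_ignoring_empty_fields` are hypothetical and exist only for this illustration. The sketch does not reproduce how the clang driver compares targets; it only shows that the found string carries one extra, empty field.

```cpp
#include <string>
#include <vector>

// Split a target string such as "nvptx64-nvidia-cuda--sm_80-" on '-',
// keeping empty fields so that a doubled '-' shows up as an empty entry.
std::vector<std::string> split_fields(const std::string& s) {
    std::vector<std::string> fields;
    std::string current;
    for (char c : s) {
        if (c == '-') {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);  // text after the last '-' (may be empty)
    return fields;
}

// Hypothetical helper: true when two target strings contain the same
// non-empty fields in the same order, i.e. they differ only by empty fields.
bool same_ignoring_empty_fields(const std::string& a, const std::string& b) {
    auto non_empty = [](const std::vector<std::string>& all) {
        std::vector<std::string> kept;
        for (const auto& f : all) {
            if (!f.empty()) kept.push_back(f);
        }
        return kept;
    };
    return non_empty(split_fields(a)) == non_empty(split_fields(b));
}
```

Calling `split_fields` on the expected string from the warning yields five fields and on the found string six, with the additional field being empty. `same_ignoring_empty_fields` returns true for that pair, which is consistent with the two strings naming the same architecture but being formatted differently.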

On this minimal example, the code does not seem to crash at execution, but the real application does crash on the NVIDIA GPU (on AMD GPUs no binary is produced at all; see section "Describe the bug"). I tried the same example with DPC++ at commit `589824d` and did not get the warning.

### Environment

- OS: Ubuntu 20.04.6 LTS
- Target devices: NVIDIA A100-SXM4-40GB and AMD MI250x
- Commit: 2023-WW27
- CUDA 11.8, ROCm 5.3.0 
- Output of `nvidia-smi` and `sycl-ls` on the NVIDIA GPU env:
```sh
NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2
...
NVIDIA A100-SXM4-40GB
  Platform [#2]:
    Version  : CUDA 12.2
    Name     : NVIDIA CUDA BACKEND
    Vendor   : NVIDIA Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : 8.0
        Name       : NVIDIA A100-SXM4-40GB
        Vendor     : NVIDIA Corporation
        Driver     : CUDA 12.2
        Aspects    : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_native_assert ext_oneapi_bfloat16_math_functions ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width
ur_print: Images are not fully supported by the CUDA BE, their support is disabled by default. Their partial support can be activated by setting SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT environment variable at runtime.
Platform [#2]:
    Version  : HIP 50322.6
    Name     : AMD HIP BACKEND
    Vendor   : AMD Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : gfx90a:sramecc+:xnack-
        Name       : AMD Radeon Graphics
        Vendor     : AMD Corporation
        Driver     : HIP 50322.6
        Aspects    : gpu fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_native_assert ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image
        info::device::sub_group_sizes: 64
```

Additional context

I am using this sycl container.

npmiller commented 4 months ago

Hello,

As I understand it, the 2023-WW27 release is only for the OpenCL RT for Intel CPU and the FPGA emulator; it is not a general DPC++ release.

I believe the specific issue you mention was fixed a while ago. You could try one of the daily packages instead, such as nightly-2024-05-21, or the Intel oneAPI release along with the Codeplay plugins for NVIDIA and AMD.

Aympab commented 4 months ago

I am having trouble building the project on the daily release, although I'm using the same env and command as I used to. I get an error about a boost/mp11 include which stops me from building the project. I'll update you on this issue when I manage to solve this.

AvailableCXGuo commented 3 months ago

> I am having trouble building the project on the daily release, although I'm using the same env and command as I used to. I get an error about a boost/mp11 include which stops me from building the project. I'll update you on this issue when I manage to solve this.

Hi there, have you solved the boost/mp11 problem? I get an error about a missing "sycl/detail/boost/mp11/algorithms.hpp".

Aympab commented 3 months ago

Hi, I haven't managed to work around the error. For now I'm just using an older version of DPC++, but eventually I'll have to update. Let me know if you find out anything about this; maybe we should open a separate issue.