intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

[CUDA] ptxas: unresolved extern function __spirv_AtomicLoad(...) #1111

Closed fwyzard closed 4 years ago

fwyzard commented 4 years ago

Hi, following the blog post and the latest instructions from Codeplay here and here, I am trying to use the cuda branch to compile a simple SYCL/oneAPI application for the NVPTX backend.

Minimal background: this stand-alone application is a very small part of the CMS reconstruction software that we have ported to run on GPUs, and are using to evaluate different performance portability libraries and frameworks.

I have built the toolchain with

SYCL_HOME=$PWD
git clone https://github.com/codeplaysoftware/sycl-for-cuda -b cuda llvm

mkdir build
cd build
cmake $SYCL_HOME/llvm/llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="X86;PowerPC;AArch64;NVPTX" \
  -DLLVM_EXTERNAL_PROJECTS="llvm-spirv;sycl" \
  -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;compiler-rt;lld;openmp;llvm-spirv;sycl;libclc" \
  -DLLVM_EXTERNAL_SYCL_SOURCE_DIR=$SYCL_HOME/llvm/sycl \
  -DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=$SYCL_HOME/llvm/llvm-spirv \
  -DLLVM_ENABLE_EH=ON \
  -DLLVM_ENABLE_PIC=ON \
  -DLLVM_ENABLE_RTTI=ON \
  -DSYCL_BUILD_PI_CUDA=ON \
  -DLIBCLC_TARGETS_TO_BUILD="nvptx64--;nvptx64--nvidiacl" \
  -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.2

make -j`nproc` sycl-toolchain

The build seems successful, and I can use it to compile my test program for the SPIR-V backend:

cd ..
git clone git@github.com:fwyzard/pixel-standalone.git -b oneapi test
cd test
$SYCL_HOME/build/bin/clang++ -fsycl -I/opt/intel/inteloneapi/dpcpp-ct/latest/include -O2 -std=c++14 -DDIGI_ONEAPI -o test-oneapi main_oneapi.cc rawtodigi_oneapi.cc

However, building for the NVPTX backend with

$SYCL_HOME/build/bin/clang++ -fsycl -I/opt/intel/inteloneapi/dpcpp-ct/latest/include -fsycl-targets=nvptx64-nvidia-cuda-sycldevice --cuda-path=/usr/local/cuda -O2 -std=c++14 -DDIGI_ONEAPI -o test-oneapi-cuda main_oneapi.cc rawtodigi_oneapi.cc

fails with

ptxas fatal   : Unresolved extern function '_Z18__spirv_AtomicLoadPU3AS1KjN5__spv5ScopeENS1_19MemorySemanticsMaskE'
clang-11: error: ptxas command failed with exit code 255 (use -v to see invocation)
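For reference, the mangled name in the ptxas message can be demangled to see exactly which overload is unresolved. This is a minimal sketch, assuming that either `c++filt` (binutils) or `llvm-cxxfilt` is on the PATH. The variable names `sym` and `demangled` are only for illustration, and the script falls back to printing the raw name if no demangler is found.

```shell
#!/bin/sh
# Mangled name copied verbatim from the ptxas error message above.
sym='_Z18__spirv_AtomicLoadPU3AS1KjN5__spv5ScopeENS1_19MemorySemanticsMaskE'

# Use whichever demangler is installed (assumption: one of these two tools);
# if neither is available, keep the raw mangled name.
if command -v c++filt >/dev/null 2>&1; then
  demangled=$(printf '%s\n' "$sym" | c++filt)
elif command -v llvm-cxxfilt >/dev/null 2>&1; then
  demangled=$(printf '%s\n' "$sym" | llvm-cxxfilt)
else
  demangled=$sym
fi

printf '%s\n' "$demangled"
```

The mangling encodes a `__spirv_AtomicLoad` overload that takes a pointer to `const unsigned int` in address space 1, followed by a `__spv::Scope` and a `__spv::MemorySemanticsMask` argument.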

Is this something that is not supported yet (possibly due to the use of the Intel® DPC++ Compatibility Tool header)?

Or am I missing something while building the toolchain?

Thank you, .Andrea

fwyzard commented 4 years ago

@Alexander-Johnston FYI

Ruyk commented 4 years ago

This is a known missing mapping to a PTX builtin. It will come soon after we manage to merge https://github.com/intel/llvm/pull/1091

steffenlarsen commented 4 years ago

Implementation of the mappings: https://github.com/codeplaysoftware/sycl-for-cuda/pull/14.

steffenlarsen commented 4 years ago

The PR for fixing this has been retargeted: https://github.com/intel/llvm/pull/1173