Open llvmbot opened 3 years ago
LLVM IR of failing atomic loads and stores
While the problem appeared with SYCL (the implementation at https://github.com/intel/llvm), it is actually caused by the NVPTX backend, which doesn't know how to select the atomic instructions. I'm attaching LLVM IR that shows (some of) the problematic instructions; I get the exact same error with
$ llc -march=nvptx64
For C/C++ 'relaxed' (LLVM's 'monotonic'), I implemented basic support in August 2018: https://reviews.llvm.org/D50391
In that revision, I already noted that "Higher levels of atomicity (like acquire and release) need additional synchronization properties which were added with PTX ISA 6.0 / sm_70."
For older hardware, libcu++ shows a (hopefully correct) mapping with memory barriers: https://github.com/NVIDIA/libcudacxx/blob/main/include/cuda/std/detail/__atomic_generated (the code is generated by https://github.com/NVIDIA/libcudacxx/blob/main/codegen/codegen.cpp)
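As a rough host-side illustration of what "a mapping with memory barriers" means, the sketch below expresses stronger orderings as a relaxed access plus explicit fences. The helper names are hypothetical, and this is plain standard C++, not the libcu++ code or an NVPTX lowering. It only shows the general shape (barrier, then relaxed access, then barrier) and makes no claim of exact equivalence to a native seq_cst operation.

```cpp
#include <atomic>

// Hypothetical helpers for illustration only; not taken from LLVM or libcu++.

// A sequentially consistent load sketched as: full fence, relaxed load,
// acquire fence. The relaxed load is the only atomic access; ordering comes
// from the surrounding fences.
inline int load_seq_cst_via_fences(const std::atomic<int>& a) {
    std::atomic_thread_fence(std::memory_order_seq_cst);  // leading full barrier
    int v = a.load(std::memory_order_relaxed);            // 'monotonic' in LLVM terms
    std::atomic_thread_fence(std::memory_order_acquire);  // later accesses stay after the load
    return v;
}

// A release store sketched as: release fence, then relaxed store.
inline void store_release_via_fence(std::atomic<int>& a, int v) {
    std::atomic_thread_fence(std::memory_order_release);  // earlier accesses stay before the store
    a.store(v, std::memory_order_relaxed);
}
```

The point of the shape is that a backend which can already select relaxed loads and stores only needs a barrier instruction in addition to support the stronger orderings.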
The LLVM NVPTX backend might not know how to select the right atomic instruction from the ISA, or the compiler might not handle the requested atomicity guarantees.
I am trying to compile and run the test4.cpp code on an NVIDIA card (GeForce RTX 2080 SUPER, CUDA architecture 75); see the instructions in the attached archive.
assigned to @Artem-B
Extended Description
If the queues[0]->enqueue(0); line at test4.cpp:22 is uncommented, I get this weird compile error:
fatal error: error in backend: Cannot select: t11: i32,ch = AtomicLoad<(load seq_cst 4 from %ir._M_i.i.i.i.i.i)> t6:1, t10
  t10: i64 = add nuw t6, Constant:i64<8>
    t6: i64,ch = load<(load 8 from %ir.arg, !tbaa !9, addrspace 1)> t0, t15, undef:i64
      t15: i64,ch = load<(dereferenceable invariant load 8 from i64 addrspace(101)* null, addrspace 101)> t0, TargetExternalSymbol:i64'_ZTSZZ4mainENKUlRN2cl4sycl7handlerEE39_16clES2_EUlNS0_7nd_itemILi3EEEE40_63_param_0', undef:i64
        t1: i64 = TargetExternalSymbol'_ZTSZZ4mainENKUlRN2cl4sycl7handlerEE39_16clES2_EUlNS0_7nd_itemILi3EEEE40_63_param_0'
        t3: i64 = undef
      t3: i64 = undef
    t9: i64 = Constant<8>
In function: _ZTSZZ4mainENKUlRN2cl4sycl7handlerEE39_16clES2_EUlNS0_7nd_itemILi3EEEE40_63
clang-12: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 13.0.0 (https://github.com/intel/llvm 73b7da0314703154d613d7883a3483468e7e461a)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/dadosaru/sycl_workspace/llvm/build/bin
clang-12: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
CMakeFiles/test4.dir/build.make:102: recipe for target 'test4' failed
make[3]: [test4] Error 70
CMakeFiles/Makefile2:129: recipe for target 'CMakeFiles/test4.dir/all' failed
make[2]: [CMakeFiles/test4.dir/all] Error 2
CMakeFiles/Makefile2:136: recipe for target 'CMakeFiles/test4.dir/rule' failed
make[1]: [CMakeFiles/test4.dir/rule] Error 2
Makefile:150: recipe for target 'test4' failed
make: [test4] Error 2
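The node the backend fails to select, AtomicLoad<(load seq_cst 4 ...)>, is a 4-byte sequentially consistent load. In C++ that is what std::atomic<int>::load() produces when no memory order is passed. The sketch below uses a hypothetical Queue type (test4.cpp itself is not reproduced in this report) to contrast the default load with an explicitly relaxed one, which corresponds to the 'monotonic' ordering that D50391 added basic support for.

```cpp
#include <atomic>

// Hypothetical stand-in for the queue type in test4.cpp; the real code is
// only available in the attached archive.
struct Queue {
    std::atomic<int> tail{0};

    // No memory order given: defaults to std::memory_order_seq_cst, i.e. the
    // kind of 'load seq_cst 4' that appears in the Cannot select message.
    int read_default() const { return tail.load(); }

    // Explicit relaxed order: LLVM 'monotonic'.
    int read_relaxed() const { return tail.load(std::memory_order_relaxed); }
};
```

Switching a load to relaxed weakens its ordering guarantees, so it is only a workaround where the algorithm does not rely on acquire or sequentially consistent semantics at that point.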