Open JonChesterfield opened 4 years ago
@llvm/issue-subscribers-backend-amdgpu
Author: Jon Chesterfield (JonChesterfield)
leak_addr
was not emitted because by default it is a host variable. Add __device__
to it will cause it emitted
https://godbolt.org/z/T3nMPKEMG
The explicit instantiation has internal linkage in the default -fno-gpu-rdc
mode since there is only one TU. If it is not used in the same TU, internalization will remove it. If you want to keep it so that it may be used by other TU, add -fgpu-rdc
, then it has linkonce_odr linkage and will be kept.
Extended Description
A template instantiation in hip code compiled with cuda-device-only is missing from the generated bitcode.
Component is HIP, but there is no HIP on the drop down list so marking as C++.
https://godbolt.org/z/GB9MY9 clang atomic.hip --cuda-device-only -nogpulib --cuda-gpu-arch=gfx906 -emit-llvm
Results in IR target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-ni:7" target triple = "amdgcn-amd-amdhsa" define hidden i32 @_Z14atomic_inc_intPi(i32 nocapture %0) local_unnamed_addr #0 { %2 = atomicrmw add i32 %0, i32 1 seq_cst ret i32 %2 }
attributes #0 = { ... }
!llvm.module.flags = !{#0, !1} !llvm.ident = !{#2}
!0 = !{i32 1, !"wchar_size", i32 4} !1 = !{i32 7, !"PIC Level", i32 1} !2 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 6185dc0eb3ad35e1f85f2ab1038ca978563099f4)"}
Also raised on a downstream tracker before realising it reproduces on trunk (https://github.com/ROCm-Developer-Tools/aomp/issues/67)