Open Quuxplusone opened 3 years ago
Bugzilla Link | PR52511 |
Status | NEW |
Importance | P enhancement |
Reported by | Jon Chesterfield (jonathanchesterfield@gmail.com) |
Reported on | 2021-11-15 08:20:02 -0800 |
Last modified on | 2021-11-15 08:49:25 -0800 |
Version | unspecified |
Hardware | PC Linux |
CC | jdoerfert@anl.gov, llvm-bugs@lists.llvm.org, Matthew.Arsenault@amd.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
This doesn't need libm headers to reproduce:
#pragma omp declare target
double func(double);
const double glob = func(4.2);
#pragma omp end declare target
clang++ -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa math_exp.cpp -o math_exp -lm -save-temps
define internal void @__omp_offloading__10303_4068b_glob_l11_ctor() #0 {
entry:
%call = call double @_Z4funcd(double 4.200000e+00) #10
store double %call, double addrspace(4)* @_ZL4glob, align 8, !tbaa !8
ret void
}
Changing the invocation to plain C++ (no OpenMP) changes the codegen:
clang++ -std=c++14 -Wall -Wextra -emit-llvm -O2 -ffreestanding -fno-exceptions --target=amdgcn-amd-amdhsa -march=gfx906 -mcpu=gfx906 -Xclang -fconvergent-functions -nogpulib math_exp.cpp -O3 -c -save-temps
; cast away the addrspace before the store
define internal void @__cxx_global_var_init() #0 {
entry:
%call = call double @_Z4funcd(double 4.200000e+00) #3
store double %call, double* addrspacecast (double addrspace(4)* @_ZL4glob to double*), align 8, !tbaa !3
%0 = call {}* @llvm.invariant.start.p0i8(i64 8, i8* addrspacecast (i8 addrspace(4)* bitcast (double addrspace(4)* @_ZL4glob to i8 addrspace(4)*) to i8*))
ret void
}
This is then optimised to IR that discards the result entirely:
define internal void @_GLOBAL__sub_I_math_exp.cpp() #2 {
entry:
%call.i = tail call double @_Z4funcd(double 4.200000e+00) #3
%0 = tail call {}* @llvm.invariant.start.p0i8(i64 8, i8* addrspacecast (i8 addrspace(4)* bitcast (double addrspace(4)* @_ZL4glob to i8 addrspace(4)*) to i8*)) #4
ret void
}
That is to say, C++ looks broken in a slightly different way to OpenMP C++.
Glibc / constexpr doesn't seem to be involved.
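For comparison, here is a minimal host-only sketch of what the dynamic initialiser is expected to do: call `func` once before `main` and keep the result in `glob`. The body of `func` and the helper `read_glob` are hypothetical and added only so the snippet links and can be checked; the reproducer above leaves `func` undefined.

```cpp
#include <cassert>

// Hypothetical stand-in for the undefined `func` in the reproducer.
// It is not constexpr, so the language treats the initialiser of `glob`
// as dynamic (an optimiser may still fold it, but the value must survive).
double func(double x) { return x * 2.0; }

// Same shape as the reproducer: a const global initialised by a call.
const double glob = func(4.2);

// Hypothetical helper: reads the global back after initialisation has run.
double read_glob() { return glob; }
```

Calling `read_glob()` from `main` on a host build should return the same value as `func(4.2)`. The IR above suggests the amdgcn build loses that value, because the store to `@_ZL4glob` is dropped.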