Extended Description
Take the following program fragment:
static int foo() { return 23; }

int bar()
{
    static const int FOO = foo();
    return FOO;
}
When compiled with clang (r205174) at -O3 and without thread-safe statics:
% ~/LLVM/build/Release+Asserts/bin/clang++ -S -emit-llvm -O3 -fno-threadsafe-statics clang.cpp
It is compiled to this (full output attached):
define i32 @_Z3barv() #0 {
entry:
  %.b = load i1* @_ZGVZ3barvE3FOO, align 1
  br i1 %.b, label %init.end, label %init.check

init.check:                                       ; preds = %entry
  store i32 23, i32* @_ZZ3barvE3FOO, align 4, !tbaa !1
  %0 = tail call {}* @llvm.invariant.start(i64 4, i8* bitcast (i32* @_ZZ3barvE3FOO to i8*))
  store i1 true, i1* @_ZGVZ3barvE3FOO, align 1
  br label %init.end

init.end:                                         ; preds = %entry, %init.check
  %1 = load i32* @_ZZ3barvE3FOO, align 4, !tbaa !1
  ret i32 %1
}
As can be seen, although "FOO" is initialized with the constant value 23, the initialization happens neither at compile time nor even at run time during program startup. Instead, a guard flag ("_ZGVZ3barvE3FOO") is checked each time "bar" is called to determine whether "FOO" has already been initialized.
(With thread-safe statics there is additional code to make sure the flag is checked and set in a thread-safe manner.)
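For readers less used to IR, the guarded initialization is roughly what one would get by writing the check by hand. The following is only a sketch of that behaviour in the non-thread-safe case; "FOO_guard", "FOO_storage", "bar_lowered" and "self_check" are hypothetical names standing in for the compiler-generated symbols, not anything clang emits under those names.

```cpp
#include <cassert>

static int foo() { return 23; }

// Hypothetical stand-ins for the compiler-generated symbols.
static bool FOO_guard = false;  // plays the role of _ZGVZ3barvE3FOO
static int FOO_storage;         // plays the role of _ZZ3barvE3FOO

// Hand-written approximation of what bar() is lowered to:
// the guard is tested on every call, the store happens only once.
int bar_lowered()
{
    if (!FOO_guard) {
        FOO_storage = foo();  // folded to "store i32 23" by the optimizer
        FOO_guard = true;
    }
    return FOO_storage;
}

// Small usage check: the value is the same on the first and later calls.
void self_check()
{
    assert(bar_lowered() == 23);
    assert(bar_lowered() == 23);
}
```

The point of the report is that the branch on the guard survives even though the stored value is a known constant.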
When "foo" is marked "constexpr", the generated code is as expected:
define i32 @_Z3barv() #0 {
entry:
  ret i32 23
}
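For reference, the "constexpr" variant of the test case would look something like the sketch below (assuming C++11 or later). The "static_assert" and the "self_check" helper are additions for illustration only and are not part of the original test case.

```cpp
#include <cassert>

// Same function as before, but usable in constant expressions.
constexpr int foo() { return 23; }

// Illustration only: confirms foo() is evaluated at compile time.
static_assert(foo() == 23, "foo() must be a constant expression");

int bar()
{
    // The initializer is now a constant expression, so FOO can be
    // constant-initialized and no run-time guard is needed.
    static const int FOO = foo();
    return FOO;
}

// Small usage check (hypothetical helper).
void self_check()
{
    assert(bar() == 23);
}
```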
However, changing the function to "constexpr" is not always possible, e.g. when the function comes from an external header such as xmmintrin.h:
#include <xmmintrin.h>

float baz()
{
    static const __m128 one = _mm_set_ss(1.0f);
    return _mm_cvtss_f32(one);
}
Results in:
define float @_Z3bazv() #0 {
entry:
  %.b = load i1* @_ZGVZ3bazvE3one, align 1
  br i1 %.b, label %init.end, label %init.check

init.check:                                       ; preds = %entry
  store <4 x float> <float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>, <4 x float>* @_ZZ3bazvE3one, align 16, !tbaa !1
  %0 = tail call {}* @llvm.invariant.start(i64 16, i8* bitcast (<4 x float>* @_ZZ3bazvE3one to i8*))
  store i1 true, i1* @_ZGVZ3bazvE3one, align 1
  br label %init.end

init.end:                                         ; preds = %entry, %init.check
  %1 = load <4 x float>* @_ZZ3bazvE3one, align 16, !tbaa !1
  %vecext.i = extractelement <4 x float> %1, i32 0
  ret float %vecext.i
}
However, changing the code to the equivalent (essentially inlining _mm_set_ss):
float baz()
{
    static const __m128 one = { 1.0f, 0.0f, 0.0f, 0.0f };
    return _mm_cvtss_f32(one);
}
Results in the expected:
define float @_Z3bazv() #0 {
entry:
  ret float 1.000000e+00
}