Bugzilla Link	19290
Version	trunk
OS	All
Attachments	Complete bitcode output by clang for first program fragment
Reporter	LLVM Bugzilla Contributor
CC	@hfinkel,@zygoloid,@rnk

Extended Description

Take the following program fragment:

static int foo() { return 23; }

int bar() {
  static const int FOO = foo();

  return FOO;
}

When compiled with clang (r205174) at -O3 and without thread-safe statics:

% ~/LLVM/build/Release+Asserts/bin/clang++ -S -emit-llvm -O3 -fno-threadsafe-statics clang.cpp

It is compiled to this (full output attached):

define i32 @_Z3barv() #0 {
entry:
  %.b = load i1* @_ZGVZ3barvE3FOO, align 1
  br i1 %.b, label %init.end, label %init.check

init.check:                                       ; preds = %entry
  store i32 23, i32* @_ZZ3barvE3FOO, align 4, !tbaa !1
  %0 = tail call {}* @llvm.invariant.start(i64 4, i8* bitcast (i32* @_ZZ3barvE3FOO to i8*))
  store i1 true, i1* @_ZGVZ3barvE3FOO, align 1
  br label %init.end

init.end:                                         ; preds = %entry, %init.check
  %1 = load i32* @_ZZ3barvE3FOO, align 4, !tbaa !1
  ret i32 %1
}

As can be seen, although "FOO" is initialized with the constant value 23, the initialization happens neither at compile time nor once at program startup. Instead, there is a guard flag ("_ZGVZ3barvE3FOO") that is tested each time "bar" is called to determine whether "FOO" has already been initialized.

(With thread-safe statics there is additional code to make sure the flag is checked and set in a thread-safe manner.)
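For illustration, the IR above corresponds roughly to the following hand-written C++. This is only a sketch of the guard pattern, not the compiler's actual lowering: the names FOO_storage, FOO_guard, foo_counted, bar_manual and init_count are hypothetical stand-ins, and the counter exists only to make the single initialization observable.

```cpp
// Hand-written approximation of the guarded initialization emitted for
//   static const int FOO = foo();
// when thread-safe statics are disabled. All names here are hypothetical;
// the real symbols are the mangled _ZZ3barvE3FOO and _ZGVZ3barvE3FOO.

static int init_count = 0;        // illustration only: counts initializer runs

static int foo_counted() {        // stands in for foo() from the report
    ++init_count;
    return 23;
}

static int  FOO_storage;          // stands in for _ZZ3barvE3FOO
static bool FOO_guard = false;    // stands in for _ZGVZ3barvE3FOO

int bar_manual() {
    if (!FOO_guard) {             // the flag is tested on every call
        FOO_storage = foo_counted();  // after inlining this is "store i32 23"
        FOO_guard = true;
    }
    return FOO_storage;           // reloaded from memory on every call
}
```

Calling bar_manual() repeatedly runs the initializer once but pays for the flag test and the load each time, which is the overhead this report is about.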

When "foo" is marked "constexpr", the generated code is as expected:

define i32 @_Z3barv() #0 {
entry:
  ret i32 23
}
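A minimal sketch of that constexpr variant, assuming C++11 or later. The static_assert is an addition for illustration: it compiles only because "FOO" is now constant-initialized, which is also why no guard variable is needed.

```cpp
// Assumes C++11 or later. Same fragment as in the report, with foo()
// marked constexpr so that FOO has a constant initializer.
constexpr int foo() { return 23; }

int bar() {
    static const int FOO = foo();  // constant-initialized: no run-time guard
    // Illustration only: FOO is usable in a constant expression here.
    static_assert(FOO == 23, "FOO should be a compile-time constant");
    return FOO;
}
```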

However, changing the function to "constexpr" is not always possible, e.g. when the function comes from an external header such as xmmintrin.h:

#include <xmmintrin.h>

float baz() {
  static const __m128 one = _mm_set_ss(1.0f);

  return _mm_cvtss_f32(one);
}

This results in:

define float @_Z3bazv() #0 {
entry:
  %.b = load i1* @_ZGVZ3bazvE3one, align 1
  br i1 %.b, label %init.end, label %init.check

init.check:                                       ; preds = %entry
  store <4 x float> <float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>, <4 x float>* @_ZZ3bazvE3one, align 16, !tbaa !1
  %0 = tail call {}* @llvm.invariant.start(i64 16, i8* bitcast (<4 x float>* @_ZZ3bazvE3one to i8*))
  store i1 true, i1* @_ZGVZ3bazvE3one, align 1
  br label %init.end

init.end:                                         ; preds = %entry, %init.check
  %1 = load <4 x float>* @_ZZ3bazvE3one, align 16, !tbaa !1
  %vecext.i = extractelement <4 x float> %1, i32 0
  ret float %vecext.i
}

However, changing the code to the following equivalent (essentially inlining _mm_set_ss by hand):

float baz() {
  static const __m128 one = { 1.0f, 0.0f, 0.0f, 0.0f };

  return _mm_cvtss_f32(one);
}

results in the expected:

define float @_Z3bazv() #0 {
entry:
  ret float 1.000000e+00
}

llvm / llvm-project

Missed optimization with "static const" local variable initialized with function #19664
