llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.37k stars 12.14k forks source link

[LTO] Dynamic initializer incorrectly optimized out #110232

Open mstorsjo opened 2 months ago

mstorsjo commented 2 months ago

Since 601645c3b70e2a17d18779a3a51b8bc9ecdc9aa6, CC @rnk, the following testcase fails:

main.cpp:

int DoIt() { return 0; }
inline int Unused0 = DoIt();
int GetUnused();
int main(int, char **) { return GetUnused(); }

lib.cpp:

int DoIt();
inline int Unused0 = DoIt();
int GetUnused() { return Unused0; }

stub.c:

void __cxa_guard_acquire(void) {}
void __cxa_guard_release(void) {}
void __gxx_personality_seh0(void) {}
void __cxa_guard_abort(void) {}
void _Unwind_Resume(void) {}
void __main(void) {}

To reproduce:

$ clang -target x86_64-w64-mingw32 -c stub.c
$ clang -target x86_64-w64-mingw32 -c main.cpp -mguard=cf -flto=thin
$ clang -target x86_64-w64-mingw32 -c lib.cpp -mguard=cf -flto=thin
$ lld-link lib.o main.o stub.o -out:out.exe -entry:main
lld-link: error: undefined symbol: __cxx_global_var_init
>>> referenced by out.exe.lto.lib.obj

The reason for this is visible, if we inspect the output of the LTO compile:

$ lld-link lib.o main.o stub.o -out:out.exe -entry:main -lldemit:asm
$ cat out.exe.lto.main.s | grep -A1 -B1 __cxx_global_var_init
    .section    .gfids$y,"dr"
    .symidx __cxx_global_var_init
    .section    .giats$y,"dr"
--
    .addrsig
    .addrsig_sym __cxx_global_var_init
    .addrsig_sym Unused0

So due to the -mguard=cf, we end up keeping a reference to __cxx_global_var_init in the .gfids$y section - but the LTO optimization itself has decided to optimize out the __cxx_global_var_init function entirely.

It seems like the exact ordering of object files is crucial here; if I swap the order of lib.o and main.o in this example, then out.exe.lto.main.s does retain the __cxx_global_var_init function, but out.exe.lto.lto.s has lost the function instead.

This was originally reported downstream at https://github.com/mstorsjo/llvm-mingw/issues/457.

Debugging what really happens during the LTO compilation is somewhat finicky; if I grab the LTO IR with -lldemit:llvm -out:lto.bc and try to compile that with llc, I get different results as well. (In that case, I end up with an undefined GetUnused() function.)

As the referenced commit 601645c3b70e2a17d18779a3a51b8bc9ecdc9aa6 was NFCI, no functional changes intended, but we do have functional changes here, I think it would be reasonable to revert the commit.

mstorsjo commented 2 months ago

I'll go ahead and revert https://github.com/llvm/llvm-project/commit/601645c3b70e2a17d18779a3a51b8bc9ecdc9aa6, as it indeed wasn't NFC as intended.

mstorsjo commented 2 months ago

Reverted in 0cf4cb4bde440586c310554d93dc09e47cb9bb79.

rnk commented 2 months ago

Right, this should be "has comdats, but is not COFF", and that covers mingw. I think the predicate this code wants is more like "are comdat groups supported". COFF supports associativity, which is different, but we model it with groups because I thought that was a good idea ~9 years ago.