Open llvmbot opened 8 years ago
Further to Mehdi's point: if someone compiles one file with -ffreestanding, and another without, then that other file has an implicit dependency on the target library. Since the target library is available in one translation unit, can't we assume it's available for the entire linkage unit?
But what if you build your own libc and then link statically against it? The libc is likely built with either -ffreestanding
, -fno-builtin
or combination of -fno-builtin-*
, but the main application doesn't have these options as libc is actually available. And for embedded or performance-critical applications you may want the whole thing to be LT-Optimized.
Withdrawing the part of the update about the workaround not working: that's just because we started to use lld, and the workaround had a gold-specific flag.
FWIW, this is still a problem. More over, the workaround I used (running ThinLTO with -O0
) no longer works. Something on the LLVM side just got smarter, I guess.
In that case I'm fine with some sort of module flag (where any -fno-freestanding
implies -fno-freestanding
). Probably deserves an llvm-dev post first, in case we've all missed something though.
Duncan: that's a good point. I also agree with your reasoning.
I subscribe to Duncan reasoning, this seems in line with the correctness issue I mentioned above. I also "solves" quite nicely libraries distribution.
We could use a similar principle to support other target library features (which also don't survive LTO right now), like vector acceleration libraries. If one translation unit has an acceleration library available, then we can implicitly make it available for the whole linkage unit.
Further to Mehdi's point: if someone compiles one file with -ffreestanding
, and another without, then that other file has an implicit dependency on the target library. Since the target library is available in one translation unit, can't we assume it's available for the entire linkage unit?
Put another way: it's already true that someone compiling for a freestanding environment needs to ensure that every translation unit is compiled with -ffreestanding
. Can we take advantage of that to avoid pessimizing?
Not a semantic change, but it could be a major pessimization. TBH, I'm not sure how common this is (I could see it in libclang_rt
, maybe?), but here's what I'm concerned about:
-ffreestanding
users and others.-ffreestanding -flto
.-ffreestanding
infects the others, which could include big applications.Where things gets murky is that the one bitcode that was not compiled originally with -ffreestanding
may already have run loop-idiom recognize and generated a call to memset
.
This will lead into a link error for undefined symbols. We could equally require a consistent flag across files.
Compile as if with -ffreestanding.
I believe this would not be a semantics change for the non-freestanding part, and David Majnemer agrees.
What should be the behaviour if one file is compiled with -ffreestanding
, and another with -fno-freestanding
?
I had a brief look at what it would take to have a per-function TLI, and I'm not convinced that it would be worth the effort.
Here are all the call sites we'd need to thread a function through to: http://llvm-cs.pcc.me.uk/include/llvm/Analysis/TargetLibraryInfo.h/rhas
There are a few which appear to have no function context, e.g. http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/GlobalOpt.cpp#2407
That combined with Mehdi's point would seem to convince me that per-module is correct in this case.
Freestanding is supposed to be a property of the final program though, isn't it?
Also the TLI is set up for the pipeline, and not reset per function I believe. I don't know what it'd take to make it behave per-function.
Can we make this a function attribute instead? This would allow us to have different behaviour if some files are compiled with -ffreestanding
and others aren't... matching the non-LTO behaviour.
(summarising IRC discussion with Mehdi)
The -fno-builtin
flag as currently implemented has the wrong semantics. It should not affect whether the compiler may use standard library functions, but rather whether builtin functions receive builtin semantics. See GCC documentation here:
https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html
The flag that this program should be using to avoid the call to memset
is -ffreestanding
.
A correct implementation of -fno-builtin
would affect only the nobuiltin
attribute, so it would not need a separate serialization. We'd still need to serialize -ffreestanding
though so it survives to the TLI in the LTO backend.
The serialization I propose is a module flag that describes whether clang disables all libcalls in TLI, i.e. SimplfyLibCalls
(http://llvm-cs.pcc.me.uk/tools/clang/lib/CodeGen/BackendUtil.cpp#258 ). Currently this would encode (-fno-builtin || -ffreestanding
) but this could change to -ffreestanding
if we ever fix -fno-builtin
semantics in the future.
Yes, this is a optimization pass that isn't added on the regular LTO side (loop idiom recognizer).
This has nothing to do with the nobuiltin attribute. The information about which library functions are available is maintained by the TargetLibraryInfoImpl
class. Clang normally fills it in here: http://llvm-cs.pcc.me.uk/tools/clang/lib/CodeGen/BackendUtil.cpp#264
We need to somehow serialize/deserialize either TargetLibraryInfoImpl
or the -fno-builtin
flags so they survive to the ThinLTO backend.
Looking
assigned to @pcc
Extended Description
Consider the following program (also attached):
Let's build it w/o libc and with ThinLTO:
What happens here is the first command line forgets to attach
nobuiltin
attribute to_start
, and the second command line recognizesmemset
in the loop and then fails by not being able to find the symbol.With full LTO the command line succeeds, most likely (I have not looked closely) due to some optimization pass disabled and therefore not triggering the issue. Still, the blame is on the Clang frontend.