llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Support split LTO units with Unified LTO #77524

Open ilovepi opened 10 months ago

ilovepi commented 10 months ago

When investigating https://github.com/llvm/llvm-project/issues/70703, we discovered that Unified LTO doesn't work with split LTO units.

To quote from that issue:

If you run Unified LTO with split LTO units enabled, empty module IDs mean that some modules will be written as regular LTO modules, and some will be written as ThinLTO modules. Unified LTO wants all modules to be either ThinLTO or regular LTO modules so that the whole program can be optimized as one piece.
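To make the quoted failure mode concrete, here is a toy model in Python. It is not LLVM's code: `Module`, `module_id`, `output_kind`, and `is_uniform` are hypothetical names, and the rule that a module with no externally visible symbols gets an empty module ID is an assumption about how the ID is derived. The sketch only shows how a per-module fallback yields a mix of regular LTO and ThinLTO outputs.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    """Hypothetical stand-in for a bitcode module."""
    name: str
    # Externally visible symbols; assumed to be the input to the module ID.
    exported_symbols: List[str] = field(default_factory=list)


def module_id(mod: Module) -> str:
    """Toy module ID: empty when the module exports nothing (assumption)."""
    if not mod.exported_symbols:
        return ""
    joined = "\n".join(sorted(mod.exported_symbols))
    return hashlib.md5(joined.encode()).hexdigest()


def output_kind(mod: Module, split_lto_unit: bool) -> str:
    """Model the fallback from the quote: with split LTO units enabled,
    a module with an empty ID is written as a regular LTO module."""
    if split_lto_unit and module_id(mod) == "":
        return "regular"
    return "thin"


def is_uniform(mods: List[Module], split_lto_unit: bool) -> bool:
    """Unified LTO wants every module written in the same form."""
    return len({output_kind(m, split_lto_unit) for m in mods}) <= 1


mods = [Module("a.o", ["foo"]), Module("b.o", [])]
for m in mods:
    print(m.name, output_kind(m, split_lto_unit=True))
print("uniform with split units:", is_uniform(mods, True))
print("uniform without split units:", is_uniform(mods, False))
```

In this model the inputs stop being uniform as soon as one module lacks an ID, which is the mismatch described in the quote.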

@ormris posted https://reviews.llvm.org/D123969 to address the module ID problem, but there were some concerns about correctness (see https://reviews.llvm.org/D123969#3460023).

I think Unified LTO should be able to work with split LTO units, regardless of the module ID issue. Using the same format and pre-link optimization pipeline shouldn't prevent orthogonal LTO features from working (at least from what I can tell). If I'm wrong and the two are fundamentally incompatible, then we need to document that and take a look at both the WPD and CFI implementations to determine whether they can be made to work without split LTO units.

I'm not exactly clear on what we'd need to do to make Unified LTO work as expected and support important optimizations like WPD and security hardening measures like CFI.

I'm CCing folks from the previous discussion. CC: @mandlebug @ormris @nikic @petrhosek @teresajohnson

llvmbot commented 10 months ago

@llvm/issue-subscribers-bug

Author: Paul Kirth (ilovepi)

nikic commented 10 months ago

IIRC the assertion failure also happens when using FatLTO without split LTO units, or at least, without using any features that would require them. You get the assertions in a plain build of clang (just -flto=thin -ffat-lto-objects, no other options), which shouldn't use split LTO units, right?