Reuse LTO products for incremental builds when deps are unchanged

rust-lang / rust

Reuse LTO products for incremental builds when deps are unchanged #71850

Open fenollp opened 4 years ago

fenollp commented 4 years ago

Incremental builds with LTO (thin or fat) always take the same amount of time (~10s on my project) when dependencies are unchanged. They seem to take as long as a build after adding a dependency.

Is it not possible to cache (at least part of) the LTO computations?

Note I'm using cross:

# cross version
cargo 1.44.0-nightly (8751eb301 2020-04-21)

Note also this bug I've encountered WRT LTO: https://github.com/rust-embedded/cross/issues/416

https://github.com/rust-lang/rust/issues/71248 is the most recent issue I could find that seems related.

pnkfelix commented 4 years ago

To be clear: You're describing a scenario where your current crate is changed, but the dependencies are unchanged, right?

fenollp commented 4 years ago

@pnkfelix Yep: changing some of the crate's code while deps are unchanged (LTO setting unchanged) takes a while to compile (incrementally).

bjorn3 commented 1 year ago

For fat LTO, caching is not possible. Fat LTO basically works by merging bitcode for all codegen units into a single llvm module and then optimizing. For ThinLTO, afaik we already reuse all optimized codegen units which LLVM tells us can be reused: https://github.com/rust-lang/rust/blob/33a2c2487ac5d9927830ea4c1844335c6b9f77db/compiler/rustc_codegen_llvm/src/back/lto.rs#L549-L558
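
To make the difference concrete, here is a toy model of the two caching situations. It is only a sketch under stated assumptions and is not how rustc or LLVM compute their keys: `Unit`, `unit`, `fat_key`, `thin_keys` and `reusable` are hypothetical names, and bitcode is represented by a plain string.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a codegen unit: its name, its bitcode, and the
/// names of the units it imports functions from during ThinLTO.
struct Unit {
    name: &'static str,
    bitcode: &'static str,
    imports: Vec<&'static str>,
}

/// Convenience constructor (hypothetical helper, for illustration only).
fn unit(name: &'static str, bitcode: &'static str, imports: &[&'static str]) -> Unit {
    Unit { name, bitcode, imports: imports.to_vec() }
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Fat LTO in this model: everything is merged into one module, so there is
/// a single key that covers the bitcode of every unit.
fn fat_key(units: &[Unit]) -> u64 {
    let all: Vec<&str> = units.iter().map(|u| u.bitcode).collect();
    hash_of(&all)
}

/// ThinLTO in this model: one key per unit, covering its own bitcode plus
/// the bitcode of the units it imports from.
fn thin_keys(units: &[Unit]) -> BTreeMap<&'static str, u64> {
    let by_name: BTreeMap<&str, &str> = units.iter().map(|u| (u.name, u.bitcode)).collect();
    units
        .iter()
        .map(|u| {
            let mut inputs = vec![u.bitcode];
            inputs.extend(u.imports.iter().map(|i| by_name[i]));
            (u.name, hash_of(&inputs))
        })
        .collect()
}

/// Names of the units whose ThinLTO key is identical in both builds,
/// returned in alphabetical order.
fn reusable(old: &[Unit], new: &[Unit]) -> Vec<&'static str> {
    let (old_keys, new_keys) = (thin_keys(old), thin_keys(new));
    new_keys
        .iter()
        .filter(|&(name, key)| old_keys.get(name) == Some(key))
        .map(|(name, _)| *name)
        .collect()
}
```

In this model, editing any one unit changes `fat_key`, so nothing from the previous fat LTO run can be reused. With `thin_keys`, only the edited unit and the units that import from it get a new key, and `reusable` lists the rest.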

fenollp commented 1 year ago

Thanks for your clear and documented answer!

> Fat LTO basically works by merging bitcode for all codegen units into a single llvm module and then optimizing.

I see how caching is not possible. Then how about changing how FatLTO is done? Instead of merging+optimizing all in one step, shouldn't we be doing this with various subsets of the crates tree? I'm thinking of walking the dependency tree from the bottom up: this way parts of the tree that weren't touched can be identified and reused.

bjorn3 commented 1 year ago

> Then how about changing how FatLTO is done? Instead of merging+optimizing all in one step, shouldn't we be doing this with various subsets of the crates tree?

What is the benefit of that over ThinLTO?

> this way parts of the tree that weren't touched can be identified and reused.

You can still have optimizations that are affected by parts of the call graph that did change. That is the point of LTO.
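
A small illustration of that point, with modules standing in for crates. The names `dep`, `app`, `scale` and `run` are hypothetical, and whether an optimizer actually performs the described transformation is an assumption, not a guarantee.

```rust
/// Stands in for an unchanged dependency crate.
mod dep {
    pub fn scale(x: u32, factor: u32) -> u32 {
        if factor == 0 {
            0
        } else {
            x * factor
        }
    }
}

/// Stands in for the crate being edited.
mod app {
    pub fn run(x: u32) -> u32 {
        // The only call site passes a constant, so a whole-program optimizer
        // may inline `scale` here and drop its `factor == 0` branch.
        // Editing this call (say, passing a runtime value instead) can change
        // what the best code for `dep::scale` is, although `dep` itself was
        // never touched.
        super::dep::scale(x, 2)
    }
}
```

Here the source of `dep` is identical across builds, yet its optimized form can depend on how `app` calls it, so an optimized copy of `dep` cached from an earlier build is not necessarily what LTO would produce now.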