Kobzol / cargo-remark

Cargo subcommand for viewing LLVM optimization remarks.
MIT License
160 stars 2 forks source link

Interactions with LTO? #17

Open VorpalBlade opened 4 months ago

VorpalBlade commented 4 months ago

I'm getting some confusing optimisation remarks, such as:

anyhow::error::<impl anyhow::Error>::msg will not be inlined into paketkoll_core::backend::arch::mtree::extract_mtree because its definition is unavailable

I would expect everything except core/std to be available since I'm building with fat LTO. From my project's Cargo.toml:

[profile.release]
lto = "fat"
opt-level = 2

Do I need to do anything extra to make cargo-remark report be based on the LTO build?

❯ cargo remark --version   
cargo-remark-remark 0.1.2
❯ rustc +nightly --version
rustc 1.80.0-nightly (7d83a4c13 2024-05-06)
❯ rustc --version         
rustc 1.78.0 (9b00956e5 2024-04-29)

Using Arch Linux, and the project is https://github.com/VorpalBlade/paketkoll

Kobzol commented 4 months ago

To be honest, I have no idea :) The standard library is not compiled when you build your code, it is precompiled. That being said, with LTO it uses some pre-built IR to make it work (IIRC), but I'm not sure if that applies to everything. Could you try building your program with -Zbuild-std (if you can get it to work)?

VorpalBlade commented 4 months ago

To be honest, I have no idea :) The standard library is not compiled when you build your code, it is precompiled

Right, which is why my example was from anyhow, not the standard library, I would have expected that to work. I'll try Zbuild-std tomorrow.

VorpalBlade commented 4 months ago

Could you try building your program with -Zbuild-std (if you can get it to work)?

For anyone else wondering, the command needed seems to be cargo +nightly remark build -- -Z build-std --target x86_64-unknown-linux-gnu. I have not tried to combine this with PGO (yet) as that is quite slow as it is.

Unfortunately this doesn't make a difference, as the remarks seem to not care about LTO in general at all, not just for std:

faster_hex::decode::hex_decode will not be inlined into paketkoll_core::backend::deb::parsers::parse_md5sums::{{closure}} because its definition is unavailable
core::panicking::panic_in_cleanup will not be inlined into paketkoll_core::types::issue::format_error because its definition is unavailable
std::env::current_dir will not be inlined into paketkoll_core::mtree::MTree<R>::from_reader
(and so on)

Not all those lines are from std/core.

Kobzol commented 4 months ago

It's possible that it's a bad diagnostic, but it's also possible that rustc's LTO doesn't do a perfect job and cannot in fact inline everything. I don't suppose that codegen-units = 1 helps? It only affects CGUs within a single crate, but maybe it could have some effect.

VorpalBlade commented 4 months ago

I tried codegen-units = 1, it made no difference from what I can tell.

Kobzol commented 4 months ago

I see. Then I would probably try to create an issue on rust-lang/rust, it might be a sign of some bad inlining decisions or configuration. Or maybe the remark is just wrong and the functions are not inlined because of a different reason :man_shrugging: Probably would have to be investigated within LLVM.