Open Kobzol opened 6 months ago
Small aside, but do you think that code using optimize_for_size
should be always optimized for size, or only in release mode? Since a lot of the submitted PRs use cfg!(feature = "optimize_for_size")
to test, and I'm not sure that dead code removal will always work in these cases without optimizations enabled.
Obviously, to get the best benefits, you're going to want optimizations anyway, but I'm thinking that we probably do want to ensure that code actually isn't compiled in all cases using conditional compilation, rather than just conditions. This also might help building on memory-constrained environments since the code won't have to be kept in memory before it's removed.
I also think that longer-term, there should be an easier flag to enable this feature that also takes into account optimization options (making sure loops aren't unrolled, etc.) but at least for now, while it's just a Cargo feature, it makes sense to at least figure out what cases we should optimize for.
I would be extremely surprised if any cfg-ed out code survived even in debug mode without optimizations. Even something like this:
fn main() {
#[cfg(feature = "foo")]
println!("hellorust");
}
Doesn't contain the hellorust
string in the final binary in a debug build when the feature is disabled.
in practice we've been using the cfg!(feature = "optimize_for_size")
macro a bunch, which would rely on the optimizer evaluating and simplifying boolean conditions and then removing branches or blocks. I suspect in practice this is reliable enough. Certainly on small examples in godbolt this seems to work fine.
More generally, the point of this flag is really --release
mode. The size of a debug binary is just not representative at all. Especially on embedded, --release
is effectively standard: you don't want to wait for your debug binary to be pushed through a wire to your device. Because optimization passes in general don't really give guarantees I think it's fine that the optimizer may miss some opportunities in a debug build.
I say that in part because using proper conditional compilation would make the code harder to maintain, to the point where we may require completely separate functions for when the flag is enabled or not. We've been trying to prevent completely separate implementations from happening so far: that won't always work but when we can I think the result is better.
I would be extremely surprised if any cfg-ed out code survived even in debug mode without optimizations. Even something like this:
fn main() { #[cfg(feature = "foo")] println!("hellorust"); }
Doesn't contain the
hellorust
string in the final binary in a debug build when the feature is disabled.
Right, note that this is different from:
fn main() {
if cfg!(feature = "foo") {
println!("hellorust");
}
}
since #[cfg(...)]
will prevent the code from even compiling, whereas this will compile the code but leave it inside a dead branch under if false
.
I say that in part because using proper conditional compilation would make the code harder to maintain, to the point where we may require completely separate functions for when the flag is enabled or not. We've been trying to prevent completely separate implementations from happening so far: that won't always work but when we can I think the result is better.
I agree here, mostly just wanted to ask because options do exist (like the currently unstable cfg_match
macro). And while you might not explicitly push debug code to a device, you might run tests on it off the device in debug mode, and it's important to ensure that these tests are accurate. I can imagine also wanting to compare generated assembly between debug and release mode to see what optimisations are done, without having the non-size-optimised code getting in the way.
Basically agree with the current implementation, but think it's important to properly clarify so folks are aware of the best way to implement things.
Ah, sorry, I didn't realize that with cfg!
it is different. Well, I would be also extremely surprised if if false
survived even in debug mode :laughing: But happy to see a counterexample.
Hi! I tested this option on my embedded project using embassy on a STM32F401 and my code size did go up from 335kB to 360kB. I can do some debugging/investigation why it happens if giving some instructions what to test for. Thanks!
Did it go up vs just using -Zbuild-std
without the optimize_for_size
feature, or versus a normal cargo build with a pre-built libstd?
Ah, I think I found the issue, adding -Z build-std-features="std/optimize_for_size"
to cargo build
overrides build-std-features
from cargo config.toml and that disabled panic_immediate_abort
when building my project and bumped the code size. When I add -Zbuild-std-features="std/optimize_for_size panic_immediate_abort"
my code size goes down:
with "std/optimize_for_size"
text data bss dec hex filename
339432 2800 35452 377684 5c354 init-stm32
without:
text data bss dec hex filename
340176 2800 35452 378428 5c63c init-stm32
Not a lot but it is something. Thanks!
Btw. small nitpick, why the reason for the std/ prefix? Since options like panic_immediate_abort
don't have it, sounds kind redundant when it is already in build-std-features
Hmm, that's weird, on CI we enable it with just optimize_for_size
, and in Cargo.toml
of stdlib it seems to be defined in the same way as panic_immediate_abort
(https://github.com/rust-lang/rust/pull/125011/files).
I think that optimize_for_size
is enough. This works for me:
$ cargo +nightly build -Zbuild-std -Zbuild-std-features="optimize_for_size,panic_immediate_abort" --target x86_64-unknown-linux-gnu
Not a lot but it is something. Thanks!
With what's currently merged, the max you can get is ~1.5kb savings on opt-level z. There's certainly more to be done for sure :)
Hmm, that's weird, on CI we enable it with just
optimize_for_size
, and inCargo.toml
of stdlib it seems to be defined in the same way aspanic_immediate_abort
(https://github.com/rust-lang/rust/pull/125011/files).I think that
optimize_for_size
is enough. This works for me:$ cargo +nightly build -Zbuild-std -Zbuild-std-features="optimize_for_size,panic_immediate_abort" --target x86_64-unknown-linux-gnu
Oh, right it works both way, doesn't metter if it starts with std/ or not. I assumed from the first posts it needs to start with std/. Thanks!
Sorry, that was an error on my part, fixed the code example. Thanks!
This definitely feels like something that could use a proper guide mentioning these sorts of caveats-- the unstable book would probably be a good place to put it for now. That way, we could update it as more cases like these pop up, and it would also help with the bit I mentioned about being able to have a general flag for "use size-optimised libstd" that makes sure the correct optimisation settings, etc. are applied to make it work. (Having a full list of the things people have to do to make it work, means we can take that into account when making a flag that "just works" for this case.)
Perhaps in the distant future, we could even see official builds of "size-optimised" libstd for certain architectures on rustup!
And we can also link that in the original comment so that folks can see the up-to-date version.
This has been added to min-sized-rust
to increase visibility. Thanks @Kobzol
https://github.com/rust-lang/rust/pull/125011 has added a
optimize_for_size
feature to the standard library, which is designed for special casing some algorithms incore
/alloc
/std
to provide an implementation that tries to minimize its effect on binary size.To use this feature, the standard library has to be compiled using
-Zbuild-std
:Optimizations that leverage this flag: