Open aturon opened 7 years ago
cc @rust-lang/libs @SimonSapin
This was raised in @SimonSapin's recent comment. I'd like to propose a general policy that we take care in libcore to optimize for space, given that it's the basis for working on small embedded devices.
Nominating for discussion in next triage meeting.
Any solution that removes the functionality from core will regress string formatting in no_std crates. This should be obvious but I want to spell out that this affects not only custom code for small embedded devices, but also many other crates, up to and including regular libraries (that just happen to be no_std) linked into desktop applications. Furthermore, core and std would have different behavior for the same code, which would be quite confusing and AFAIK is unprecedented.
@rkruppe All fair points. We need to figure out the right way to balance between these several legit concerns.
Re: a more general policy to optimize libcore for size, one area of libcore that eats a lot of binary space is float formatting and parsing. Together, they account for 60+ kilobytes on top of the previous, naive implementations (assuming the measurements reported in the respective PRs #24612 #27307 are still accurate — but there were few changes to the code since then). While this could be optimized somewhat, these tasks fundamentally require a lot of tricky algorithmic work and pre-computed data to be both efficient and correct. (But again, it could probably be shrunk quite a bit, so if someone cares enough to try and improve this, be my guest! I have some ideas.)
I bring this up both out of fairness (if anything should be cut, these are prime candidates) and to bring those two kilobytes into perspective.
Btw, the implementation also appears to use only RLE, with little hierarchy and no bitsets for that matter, so it could perhaps use less space.
@eddyb I tried to use std_unicode’s BoolTrie instead but that ~doubled the size… (There’s a few very large runs where RLE helps a lot.)
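For readers unfamiliar with the trade-off being discussed, here is a minimal sketch of a run-length-encoded boolean table lookup. It is not the libcore implementation: the `RUNS` data and the `lookup` function are made up for illustration, and the table only distinguishes printable ASCII from everything else. The point is that a very long run costs a single entry, which is where RLE can beat a bitset or trie.

```rust
/// Hypothetical table (not real Unicode data): alternating run lengths,
/// starting with a run of "false" code points.
///   0x00..0x20  false (controls)
///   0x20..0x7F  true  (printable ASCII)
///   0x7F..0xA0  false (DEL and C1 controls)
const RUNS: &[u32] = &[0x20, 0x5F, 0x21];

/// Walk the runs until the one containing `c` is found.
/// Code points past the end of the table are treated as `false`.
fn lookup(runs: &[u32], c: u32) -> bool {
    let mut start = 0u32;
    let mut value = false;
    for &len in runs {
        let end = start + len;
        if c < end {
            return value;
        }
        start = end;
        value = !value; // runs alternate between false and true
    }
    false
}
```

A linear walk like this is slow for large tables; a real implementation would likely add some index on top, which is the "hierarchy" mentioned above.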
To deal with competing concerns, I think that Unicode-aware escaping should probably be the default but there should be some way to opt out of it for the niche cases where code size is constrained. Now that the build system is based on Cargo, could libcore have a default Cargo feature for this? Disabling it would require building your own libcore, but that’s already required at the moment for (at least some) embedded targets.
@SimonSapin Cargo features for core are a really good idea for dealing with this tradeoff. :+1:
The libs team discussed this issue during triage today and the conclusion was that @SimonSapin's idea of a feature on libcore is great! We're totally on board with such a Cargo feature which trims down the size of many libcore routines (sometimes at the cost of correctness in the case of float parsing).
We're not likely to proactively add such a feature, but PRs doing so would be most welcome!
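As a rough sketch of what such a feature gate could look like in source, assuming a hypothetical feature named `unicode-escape-tables` (no such feature exists in libcore, and `is_printable` / `escape_for_debug` are illustrative names, not libcore's internals):

```rust
/// With the hypothetical feature enabled, consult the full tables.
/// `!c.is_control()` is only a stand-in for a real table lookup here.
#[cfg(feature = "unicode-escape-tables")]
fn is_printable(c: char) -> bool {
    !c.is_control()
}

/// With the feature disabled, fall back to an ASCII-only check that
/// needs no static data, at the cost of escaping all non-ASCII text.
#[cfg(not(feature = "unicode-escape-tables"))]
fn is_printable(c: char) -> bool {
    c.is_ascii_graphic() || c == ' '
}

/// Escape a single character the way a Debug-style formatter might.
fn escape_for_debug(c: char) -> String {
    if is_printable(c) {
        c.to_string()
    } else {
        c.escape_unicode().to_string()
    }
}
```

Built without the feature, `escape_for_debug('é')` yields `\u{e9}`; with it, the character would pass through unchanged. This is exactly the behavior difference between builds that the earlier comments worry about, which is why making the Unicode-aware path the default matters.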
I don't know how I'd select features from core. Is there infrastructure in place that allows crates to do that?
If that's not the case, a PR adding default features to core won't have much impact.
You can use xargo to build your own libcore with any configuration settings you desire (e.g. no landing pads and panic = abort and similar).
Triage: I'm not aware of any movement on this issue since Alex posted the path forward.
We’ve since added more Unicode-related functionality to libcore (and do not have a libstd_unicode crate anymore), on the basis that the linker should eliminate from the final binary any table that is unused.
I suppose this issue is relevant for cases where <str as fmt::Debug>::fmt is used, and space is very constrained. However, there is probably not much to do here until std-aware Cargo (and specifically https://github.com/rust-lang/wg-cargo-std-aware/issues/4) gets further along.
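To make the affected code path concrete, here is a small sketch of how a no_std-style program ends up calling <str as fmt::Debug>::fmt: any `{:?}` on a string does it. `FixedBuf` and `debug_into_buf` are hypothetical helpers written for this example; only `core::fmt::Write` and the `write!` macro are real library items.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-size sink, standing in for a UART or log buffer.
struct FixedBuf {
    bytes: [u8; 32],
    len: usize,
}

impl FixedBuf {
    fn as_str(&self) -> &str {
        // Only whole `&str` pieces are ever copied in, so this is valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.bytes.len() {
            return Err(fmt::Error); // out of space
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formatting with `{:?}` goes through <str as fmt::Debug>::fmt,
/// which is the code that consults the escaping tables.
fn debug_into_buf(s: &str) -> FixedBuf {
    let mut buf = FixedBuf { bytes: [0; 32], len: 0 };
    let _ = write!(buf, "{:?}", s); // overflow is ignored in this sketch
    buf
}
```

Because the escaping decision is made at run time for each character, the linker cannot drop the tables once a call like this is reachable.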
The PR to improve char escaping unfortunately added 2k of static data to libcore, which impacts the no_std use case on small devices. Even if in some cases this data could be eliminated automatically, if you ever format a character, you'll definitely bring these tables in. We should see whether there's a way to get this functionality while moving the bloat to libstd, perhaps using specialization.