rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Escaping `char` in libcore adds 2k of static data for no_std cases #39492

Open aturon opened 7 years ago

aturon commented 7 years ago

The PR to improve char escaping unfortunately added 2k of static data to libcore, which impacts the no_std use case on small devices. Even if in some cases this data could be eliminated automatically, if you ever format a character, you'll definitely bring these tables in.

We should see whether there's a way to get this functionality while moving the bloat to libstd, perhaps using specialization.
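For context, here is a minimal sketch of the behaviour difference at stake, using only the stable `char::escape_debug` and `char::escape_default` methods. The helper names `debug_escaped` and `ascii_escaped` are made up for illustration. The assumption, taken from the description above, is that the Unicode-aware path is the one that needs the static tables.

```rust
/// Hypothetical helper: Unicode-aware escaping, the same style of escaping
/// that `{:?}` formatting relies on. Printable non-ASCII characters are kept
/// as-is, which requires knowing which code points are printable.
fn debug_escaped(c: char) -> String {
    c.escape_debug().collect()
}

/// Hypothetical helper: ASCII-only escaping. Anything outside printable
/// ASCII becomes a `\u{...}` escape, so no Unicode table is consulted.
fn ascii_escaped(c: char) -> String {
    c.escape_default().collect()
}
```

With these helpers, `debug_escaped('é')` yields `é` while `ascii_escaped('é')` yields `\u{e9}`; both escape a newline as `\n`.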

aturon commented 7 years ago

cc @rust-lang/libs @SimonSapin

This was raised in @SimonSapin's recent comment. I'd like to propose a general policy that we take care in libcore to optimize for space, given that it's the basis for working on small embedded devices.

Nominating for discussion in next triage meeting.

hanna-kruppe commented 7 years ago

Any solution that removes the functionality from core will regress string formatting in no_std crates. This should be obvious but I want to spell out that this affects not only custom code for small embedded devices, but also many other crates — up to and including regular libraries (that just happen to be no_std) linked into desktop applications. Furthermore, core and std would have different behavior for the same code, which would be quite confusing and AFAIK is unprecedented.

aturon commented 7 years ago

@rkruppe All fair points. We need to figure out the right way to balance between these several legit concerns.

hanna-kruppe commented 7 years ago

Re: a more general policy to optimize libcore for size, one area of libcore that eats a lot of binary space is float formatting and parsing. Together, they account for 60+ kilobytes on top of the previous, naive implementations (assuming the measurements reported in the respective PRs #24612 #27307 are still accurate — but there were few changes to the code since then). While this could be optimized somewhat, these tasks fundamentally require a lot of tricky algorithmic work and pre-computed data to be both efficient and correct. (But again, it could probably be shrunk quite a bit, so if someone cares enough to try and improve this, be my guest! I have some ideas.)

I bring this up both out of fairness (if anything should be cut, these are prime candidates) and to bring those two kilobytes into perspective.

eddyb commented 7 years ago

Btw, the implementation also appears to use only RLE, without much hierarchy or any bitsets for that matter, so it could perhaps take up less space.
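To illustrate what a run-length-encoded (RLE) membership table looks like, here is a toy sketch. It is not the actual libcore table or lookup code: `TOY_RUNS` and `in_set` are hypothetical names, and the set is an arbitrary example, not real printability data.

```rust
/// Toy table (an assumption for illustration, not real Unicode data):
/// alternating run lengths, starting with a run of code points that are
/// NOT in the set. This encodes the set {0x20..=0x7E, 0xA1..=0xFF}.
const TOY_RUNS: &[u32] = &[
    0x20, // 0x00..=0x1F: not in the set
    0x5F, // 0x20..=0x7E: in the set
    0x22, // 0x7F..=0xA0: not in the set
    0x5F, // 0xA1..=0xFF: in the set
];

/// Walks the runs, subtracting each run length until the code point falls
/// inside a run. Everything past the last run is treated as outside the set.
fn in_set(runs: &[u32], c: char) -> bool {
    let mut remaining = c as u32;
    let mut inside = false;
    for &len in runs {
        if remaining < len {
            return inside;
        }
        remaining -= len;
        inside = !inside;
    }
    false
}
```

Four entries cover 256 code points here, whereas a flat bitset over the same range would take 32 bytes. The linear walk is the price: a hierarchical or bitset layout trades table size for lookup speed, and which one is smaller depends on how long the runs in the real data are.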

SimonSapin commented 7 years ago

@eddyb I tried to use std_unicode’s BoolTrie instead but that ~doubled the size… (There are a few very large runs where RLE helps a lot.)

To deal with competing concerns, I think that Unicode-aware escaping should probably be the default but there should be some way to opt out of it for the niche cases where code size is constrained. Now that the build system is based on Cargo, could libcore have a default Cargo feature for this? Disabling it would require building your own libcore, but that’s already required at the moment for (at least some) embedded targets.
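A rough sketch of how such an opt-out could look in code, assuming a hypothetical Cargo feature named `unicode-escape` that would be listed as a default feature in the crate manifest. Neither the feature name nor the `escape_char` function exists in libcore; they only illustrate the shape of the idea.

```rust
/// Hypothetical: compiled when the (made-up) `unicode-escape` feature is
/// enabled. Uses Unicode-aware escaping, which needs the static tables.
#[cfg(feature = "unicode-escape")]
fn escape_char(c: char) -> String {
    c.escape_debug().collect()
}

/// Hypothetical: compiled when the feature is disabled. Falls back to
/// ASCII-only escaping, so the table-driven code path is never compiled in.
#[cfg(not(feature = "unicode-escape"))]
fn escape_char(c: char) -> String {
    c.escape_default().collect()
}
```

Because exactly one of the two definitions is compiled, callers are unaffected by the choice. The observable difference is that non-ASCII printable characters come out as `\u{...}` escapes when the feature is off, which is the behaviour divergence raised earlier in the thread.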

aturon commented 7 years ago

@SimonSapin Cargo features for core are a really good idea for dealing with this tradeoff. :+1:

alexcrichton commented 7 years ago

The libs team discussed this issue during triage today and the conclusion was that @SimonSapin's idea of a feature on libcore is great! We're totally on board with a Cargo feature that trims down the size of many libcore routines (sometimes at the cost of correctness, as in the case of float parsing).

We're not likely to proactively add such a feature, but PRs doing so would be most welcome!

tbu- commented 6 years ago

I don't know how I'd select features from core. Is there infrastructure in place that allows crates to do that?

If that's not the case, a PR adding default features to core won't have much impact.

oli-obk commented 6 years ago

You can use xargo to build your own libcore with any configuration settings you desire (e.g. no landing pads and panic = abort and similar)

steveklabnik commented 5 years ago

Triage: I'm not aware of any movement on this issue since Alex posted the path forward.

SimonSapin commented 5 years ago

We’ve since added more Unicode-related functionality to libcore (and do not have a libstd_unicode crate anymore), on the basis that the linker should eliminate from the final binary any table that is unused.

I suppose this issue is relevant for cases where `<str as fmt::Debug>::fmt` is used and space is very constrained. However, there is probably not much to do here until std-aware Cargo (and specifically https://github.com/rust-lang/wg-cargo-std-aware/issues/4) gets further along.