[libc++] Feature Request - Code tweaks and build flags to produce smaller binaries when using <format> on embedded systems

llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

http://llvm.org

Other

27.96k stars 11.54k forks source link

[libc++] Feature Request - Code tweaks and build flags to produce smaller binaries when using <format> on embedded systems #93646

Open BlamKiwi opened 3 months ago

BlamKiwi commented 3 months ago

We currently use an internal format library that's a "lean" implementation of the standard for embedded devices. I've been trying to test libc++ <format> on embedded systems with very little RAM. There's some tweaks that can be made to make libc++ a bit leaner.

The C++ standard specifies that float, double and long double are explicitly listed types in std::basic_format_arg. This causes the compiler to pull in all float formatting code and associated lookup tables. This can cause binaries to bloat by >200KB even though it's never used. It is a common feature of embedded libc implementations to provide a compile option to disable float support in printf/scanf. It would be nice if libc++ provided a build flag to disable float types in <format> or somehow map floats to the handle type to implement a compiler firewall.

<charconv> uses lookup tables or dedicated code paths to efficiently convert base 10, base 2, etc. It would be good if optimized codepaths were guarded by __OPTIMIZE_SIZE__ (or equivalent libc++ define) to prevent bringing in this code.

__throw_invalid_option_format_error and __throw_invalid_type_format_error use basic_string and causes the contents of string.cpp to be pulled in. If exceptions are disabled, it would be nice if this message was simplified to a simple c string.

__format_escaped_char and __format_escaped_string use a basic_string instead of just passing through the output iterator. Is there a reason this is the case? Directly passing in the iterator would prevent basic_string usage where it would otherwise not be used.

frederick-vs-ja commented 3 months ago

__format_escaped_char and __format_escaped_string use a basic_string instead of just passing through the output iterator. Is there a reason this is the case? Directly passing in the iterator would prevent basic_string usage where it would otherwise not be used.

See https://reviews.llvm.org/D134036 although I didn't found why a basic_string buffer is wanted. I guess we can use a much simpler buffer type. CC @mordante.

mordante commented 2 months ago

__format_escaped_char and __format_escaped_string use a basic_string instead of just passing through the output iterator. Is there a reason this is the case? Directly passing in the iterator would prevent basic_string usage where it would otherwise not be used.

See https://reviews.llvm.org/D134036 although I didn't found why a basic_string buffer is wanted. I guess we can use a much simpler buffer type. CC @mordante.

I don't recall either and maybe it's no longer/not needed.

I feel this request has merits, however I think we should look at the bigger picture. If no floating-point is available we should not have to_chars for floating-point either. This code contains several large lookup tables that is not needed. At the moment I've quite a long list of items I want to work on, so I don't expect to have time to look at this soon.

ldionne commented 2 months ago

I feel this request has merits, however I think we should look at the bigger picture. If no floating-point is available we should not have to_chars for floating-point either. This code contains several large lookup tables that is not needed.

I agree about the big picture. I think a lot of the requests in this issue make sense. IMO if we want to make progress on this, we would need someone to write a short RFC explaining how we're going to tackle this, what is going to be affected and how, etc.

I'm going to un-assign you for now since you don't think you'll have time to work on this. @BlamKiwi if you have interest for pursuing this, I think that is the most likely way for this issue to make progress.

BlamKiwi commented 2 months ago

@mordante Are you suggesting that you would prefer a compile flag to just disable float support entirely in libc++?

I agree that would probably be the useful baseline. The question would then be should disabling just float <format> be a supported use case?

The pain point is paying for float formatting when you're not actually formatting floats. It is still nice to be able to prototype math with floating point values before moving to something like fixed point arithmetic. I can understand not wanting to support this edge-case though since its pretty niche.

Or to state it more plainly, implement a solution in such a way that supporting floats is orthogonal to picking different code-paths to optimize size.

@ldionne Regarding to what the consequences are to hard disabling float support, I think the C world provides the most useful example. printf, scanf etc just don't support float anymore. Things like <cmath> leads to compile/linker errors. The user experience is a bit poor, since any use of float breaks. Given only embedded/kernel development has this use-case though it's probably fine.

On the C++ side, there's annoying edge cases like std::unordered_map using floats in its API. These would probably also be disabled or replaced with an extension to the standard. e.g., a fixed point arithmetic value.

Regarding the other minor optimizations, I assume explicit build flags for modifying behaviour are preferred over compiler flags. e.g., Adding something like LIBCXX_FORMAT_OPTIMIZE_SIZE over using __OPTIMIZE_SIZE__

ldionne commented 2 months ago

@BlamKiwi If there is a way to "delay" the instantiation of format for floating point until when we actually need it, then this could be a change we implement while being conforming today, without considering the larger change about floating points. But I'm not certain that's feasible.

Another option worth considering is stuff like link-time optimization. If you want to reduce code size, perhaps this can help dead-strip code paths in format that are never used?

BlamKiwi commented 2 months ago

@BlamKiwi If there is a way to "delay" the instantiation of format for floating point until when we actually need it, then this could be a change we implement while being conforming today, without considering the larger change about floating points. But I'm not certain that's feasible.

Another option worth considering is stuff like link-time optimization. If you want to reduce code size, perhaps this can help dead-strip code paths in format that are never used?

We already use LTO where possible. I will take a second look though. Given how simple the other fixes are, it would be good to get an answer for this.

mordante commented 2 months ago

@mordante Are you suggesting that you would prefer a compile flag to just disable float support entirely in libc++?

I agree that would probably be the useful baseline. The question would then be should disabling just float <format> be a supported use case?

IMO no. For the question "should we allow libc++ to work without floating-point support" my answer would be yes.

The pain point is paying for float formatting when you're not actually formatting floats. It is still nice to be able to prototype math with floating point values before moving to something like fixed point arithmetic. I can understand not wanting to support this edge-case though since its pretty niche.

Only disabling it for <format> still means you pay the price for to_chars and stream operations. (The latter can be disabled by disabling locale support.) Both to_chars and stream operations have their implementation in the dylib so even using C++11 will add the overhead of the floating-point tables of to_chars. So my objection is purely about adding a special case for <format> and not disable floating-point in general or at least all "floating-point conversion functions" feels wrong to me.

BlamKiwi commented 2 months ago

dylib

I think this is where the misunderstanding is coming from. Embedded work largely uses LTO, -ffunction-sections and static linking. We are not paying these costs because we are not manually invoking those functions, and they're discarded by the linker. Our codebases don't use streams or std::to_chars<float> already, precisely because of their costs.

<format> by its current implementation pulls in a bunch of stuff it doesn't actually need under a static linking environment. One of those things happens to be float formatting because of the implementation of std::basic_format_arg.

mordante commented 2 months ago

We are not paying these costs because we are not manually invoking those functions, and they're discarded by the linker. Our codebases don't use streams or std::to_chars<float> already, precisely because of their costs.

Here you describe your use-case and the feature you describe perfectly fits your use-case. However in general we try to add features that are usable for a larger set of users. That's why I don't want to add a special case for your use-case. The next user might want to use shared objects and ostreams but without floating-point support.

BlamKiwi commented 2 months ago

That's fine. What I'm wanting to pursue is an improvement to <format> code-gen that will help all users using libc++ in a space constrained environment. Floating point formatting is just our particular pain point. Users on 32-bit space constrained systems might also balk at the fact 64-bit integer formatting is also pulled in. We actually do have some targets (Xtensa) with this constraint. You already have flags to do similar things like Unicode support. I just want buy-in before I go put together a merge request.

The main issue with your current implementation is you defer std::basic_format_arg format codegen until you're actually inside __handle_replacement_field. This seems to prevent the compiler/linker from knowing that particular code paths aren't actually used, especially at lower optimization levels.

I need to investigate some of the code output a bit more, but it should be possible (in a standards compliance sense) to add a private field to std::basic_format_arg which is just a type erased format function of the current argument type. This eliminates the visitor pattern usage inside __handle_replacement_field and should optimize the code-gen for minimizing size. This would fix our specific floating-point gripe, but this optimization would apply to all types listed in the standard.

This change would trade-off some runtime overhead for minimizing code size, so I would guard this implementation with a build flag. Something along the lines of LIBCXX_FORMAT_MINIMIZE_SIZE. The main issue with this change though is the resulting binary wouldn't be ABI compatible with a library built without this flag.

EDIT: I'm not saying that removing float support from libc++ as an explicit flag is a bad idea. More that improving codegen for <format> is worth doing on its own.

mordante commented 2 months ago

I agree if you have a way to improve the codegen of <format> I'd be happy to review it. There are some benchmarks for std::format which allows to determine what the performance overhead of such a solution would be.

BlamKiwi commented 2 months ago

See the associated merge request for the main code size optimization. The overhead of using pointer dispatch is ~5% on my laptop. 5% could be significant for some users, so it's probably worth guarding this with a build flag. The benefit is that only formatters actually used get instantiated.

I still need to look at the escape functions to get rid of the basic_string usage. Unfortunately the only practical way to remove the use of a temporary buffer would be to iterate over the string twice. Once to count the escaped printable string width and the second time to actually escape and pad the string.