projectfluent / fluent-rs

Rust implementation of Project Fluent
https://projectfluent.org
Apache License 2.0
1.04k stars 95 forks source link

ICU4x number formatting #303

Open JasperDeSutter opened 1 year ago

JasperDeSutter commented 1 year ago

Adds basic decimal formatting for FluentNumber.as_string(). Many features are still missing like currency symbols and percentage formatting. #269

Since data providers need to be managed explicitly, it needs to be handed to the memoizer to be able to construct a formatter. This required a new argument + associated type on Memoizable.

Something to figure out is where the data comes from. I've went with a simple FluentBundle.set_number_format_provider, which seems most flexible to me. I've added a test that uses icu_testdata for this, but other providers should work as well. https://icu4x.unicode.org/doc/icu/#data-management

JasperDeSutter commented 1 year ago

Yeah I agree the increased binary size is the probably the main concern for incorporating ICU components.

The way I've implemented it currently allows for optionally providing the data as a user, falling back to basic number formatting when no provider is available. As such there shouldn't be a size increase from this part for consumers that don't want it. This does come with a usability impact for users of fluent-rs though, since they now need to know about ICU4X and its data generation tooling. We could opt for including the data behind a feature flag within fluent (the rust code generation solution looks quite nice for this), so that it becomes much easier for users to choose with or without.

The formatting code and supporting ICU libraries are unconditionally included though. I'm sure they have a sizable impact on binary size and compile time seeing the number of crates pulled in, I'll take a look into profiling binary sizes. We could use the same feature flag for conditionally depending on these libraries as for including the data.

Note that if we would want to also use ICU4X for plural rules or locale fallback, the story changes. Since these are required for the working of fluent message formatting we couldn't just put icu libraries behind a feature flag anymore.

JasperDeSutter commented 1 year ago

I compared the average binary size of all examples in fluent-bundle: (stripped release build)

605KB - without decimal formatting (main branch): 100% 643KB - with number formatting code, no data: +6% 780KB - with decimal and fallback data for all locales: +29% 716KB - with icu_testdata (icu_decimal feature, only test locales): +18%

To me it looks like a good idea to not include data, but I'm not sure whether a feature flag is warranted for the formatting code.

gregtatum commented 1 year ago

I'd like to get some feedback with other Mozillians in this space, and we're hitting the holidays where people are out.

Also to be clear here, I'd like to wait until after the holidays at least before merging since people are probably out since this affects the architecture of this project and Firefox. I think any code updates and reviews can happen before then, but I just wanted to set expectations.

gregtatum commented 10 months ago

I'm closing as stale since there were requested changes. Feel free re-open.

fee1-dead commented 1 month ago

any chance we could get a review on this?

alerque commented 1 month ago

@fee1-dead Sure, go ahead and review. Test results welcome. Only half jesting, if you want to help move things along the best way to do that is contribute by reviewing and testing yourself and reporting results, not nagging other volunteers.

Also be aware that because this introduces a bunch of new dependencies we'll probably have to hold off on merging until we can base it on or at least compare it with what is happening in #335. This work should jive pretty well with that, but it would be hard to give it a blank yes and merge before we have a solid idea how other ICU4X integration is going to work.