Open zbraniecki opened 4 years ago
I am of the opinion that crates are not the most effective way to go about modularization. I have written in wrapper-layer.md that I believe we can use dead code elimination to achieve modularization in a much more effective way.
I believe Shane is intending to separate the part that operates on the number (like rounding, tailoring etc.) into FixedDecimal and similar helper structs and effectively removing the benefit of clustering all types of numerical operations in a single formatter. @sffc can you share your take on this if I misunderstood you?
FixedDecimal is intended as a type that preserves leading and trailing zeros on input and output of NumberFormat, which is an important feature we largely lack in 402. Rounding operations cannot be split from NumberFormat because rounding depends on locale data for currencies, compact decimals, and measurement units.
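To make the "preserves leading and trailing zeros" point concrete, here is a minimal sketch in Rust. `ToyFixedDecimal` and its `parse` method are hypothetical names invented for this illustration and are not the actual FixedDecimal API; the only idea shown is that storing digits explicitly, rather than as a float or integer, keeps "1.0" and "1.00" distinct.

```rust
use std::fmt;

/// Hypothetical stand-in for a FixedDecimal-like type (NOT the real ICU4X API).
/// Every digit is stored explicitly, so leading and trailing zeros survive.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToyFixedDecimal {
    negative: bool,
    /// Digits before the decimal separator, most significant first.
    integer_digits: Vec<u8>,
    /// Digits after the decimal separator, most significant first.
    fraction_digits: Vec<u8>,
}

impl ToyFixedDecimal {
    /// Parses ASCII strings such as "-012.500"; returns None for anything else.
    pub fn parse(input: &str) -> Option<Self> {
        let (negative, body) = match input.strip_prefix('-') {
            Some(rest) => (true, rest),
            None => (false, input),
        };
        let (int_part, frac_part) = match body.split_once('.') {
            Some((i, f)) => (i, f),
            None => (body, ""),
        };
        if int_part.is_empty() {
            return None;
        }
        // Convert each ASCII digit to its value; any other byte fails the parse.
        let to_digits = |s: &str| -> Option<Vec<u8>> {
            s.bytes()
                .map(|b| if b.is_ascii_digit() { Some(b - b'0') } else { None })
                .collect()
        };
        Some(Self {
            negative,
            integer_digits: to_digits(int_part)?,
            fraction_digits: to_digits(frac_part)?,
        })
    }
}

impl fmt::Display for ToyFixedDecimal {
    /// Writes the digits back exactly as stored, with no locale symbols applied.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.negative {
            f.write_str("-")?;
        }
        for d in &self.integer_digits {
            write!(f, "{}", d)?;
        }
        if !self.fraction_digits.is_empty() {
            f.write_str(".")?;
            for d in &self.fraction_digits {
                write!(f, "{}", d)?;
            }
        }
        Ok(())
    }
}
```

With this toy type, parsing "012.500" and printing it again yields the same string, which a round trip through `f64` would not.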
2020-12-04 discussion:
More specifically, here is how I see the breakdown of features going into FixedDecimalFormat (lower level) versus KitchenSinkNumberFormat (higher level):
FixedDecimalFormat
What: Pass-through formatter for FixedDecimal, applying localized symbols but no arithmetic.
Features:
* Sign display is slightly more complex because we need to add affixes to the number. The code could be slightly smaller if FixedDecimalFormat were "positive only", that is, not capable of outputting a sign.
** Depends on the chosen design of #228
KitchenSinkNumberFormat
What: A larger, data-driven formatter supporting a larger set of UTS 35 features.
Features:
* "Currency" encompasses currency spacing rules, currency rounding, symbol resolution, etc.
Rounding is a big chunk of the logic in ICU NumberFormatter. Unfortunately, it needs to be coupled with at least KitchenSinkNumberFormat, because the algorithms for selecting a compact form and for applying a currency both require rounding the number based on locale data.
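As an illustration of why currency rounding cannot be done without data, the sketch below rounds the same amount differently depending on the currency. Everything here is an assumption made for the example: the function names, the "millionths of a unit" representation, the choice of half-even rounding, and the hard-coded digit table (a real formatter would read fraction digits from currency data instead).

```rust
/// Fraction digits per currency. Hard-coded for illustration only;
/// a real formatter would look this up in locale/currency data.
fn currency_fraction_digits(code: &str) -> Option<u32> {
    match code {
        "JPY" => Some(0),
        "USD" | "EUR" => Some(2),
        "KWD" => Some(3),
        _ => None,
    }
}

/// Rounds an amount expressed in millionths of a currency unit (a hypothetical
/// representation) to the currency's fraction digits, using half-even rounding.
/// The result is still expressed in millionths. Returns None for unknown codes.
fn round_for_currency(amount_micros: i64, code: &str) -> Option<i64> {
    const SCALE_DIGITS: u32 = 6;
    let digits = currency_fraction_digits(code)?;
    // Size of one minor unit (e.g. one cent), measured in millionths.
    let step = 10_i64.pow(SCALE_DIGITS - digits);
    // Euclidean division keeps the remainder non-negative for negative amounts.
    let quotient = amount_micros.div_euclid(step);
    let remainder = amount_micros.rem_euclid(step);
    let twice = remainder * 2;
    // Round up when past the midpoint, or exactly at it with an odd quotient.
    let rounded = if twice > step || (twice == step && quotient % 2 != 0) {
        quotient + 1
    } else {
        quotient
    };
    Some(rounded * step)
}
```

For example, 1234.565 (passed as `1_234_565_000`) stays unchanged for KWD, becomes 1234.56 for USD, and becomes 1235 for JPY, so the rounding step has to know which currency it is formatting.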
I filed #1441 to track currency formatting.
In terms of class structure / modularity: there are 2 main dimensions:
These are the two main dimensions we need to solve. The challenge is that these two dimensions can be combined freely, and when doing so, we may need to load different data or use different code paths.
For example:
Unit \ Notation | Decimal | Compact | Scientific | Spellout |
---|---|---|---|---|
None | 1000 | 1K | 1E3 | one thousand |
Currency | $1000.00 | $1K | $1.00E3 | one thousand dollars |
Percent | 1000% | 1K% | 1E3% | one thousand percent |
Measure | 1000 m | 1K m | 1E3 m | one thousand meters |
Within each box, there may be multiple display options as well, most often long/short/narrow.
Clearly there are some formats in this table that make more sense than others. But, we need to think about how to scale up to support this grid.
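One way to picture the grid is as two independent enums whose combination decides which data must be loaded. The sketch below is purely illustrative: the `Unit` and `Notation` variants mirror the table's rows and columns, but `DataNeed` and the mapping in `data_needs` are hypothetical and do not describe actual ICU4X data keys.

```rust
/// Rows of the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Unit {
    None,
    Currency,
    Percent,
    Measure,
}

/// Columns of the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Notation {
    Decimal,
    Compact,
    Scientific,
    Spellout,
}

/// Hypothetical labels for kinds of locale data (not real ICU4X data keys).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataNeed {
    DecimalSymbols,
    CompactPatterns,
    SpelloutRules,
    CurrencyData,
    PercentPattern,
    UnitPatterns,
}

/// Illustrative mapping from one cell of the grid to the data it would need.
/// Each dimension contributes its own requirements independently.
pub fn data_needs(unit: Unit, notation: Notation) -> Vec<DataNeed> {
    let mut needs = Vec::new();
    match notation {
        Notation::Decimal | Notation::Scientific => needs.push(DataNeed::DecimalSymbols),
        Notation::Compact => {
            needs.push(DataNeed::DecimalSymbols);
            needs.push(DataNeed::CompactPatterns);
        }
        Notation::Spellout => needs.push(DataNeed::SpelloutRules),
    }
    match unit {
        Unit::None => {}
        Unit::Currency => needs.push(DataNeed::CurrencyData),
        Unit::Percent => needs.push(DataNeed::PercentPattern),
        Unit::Measure => needs.push(DataNeed::UnitPatterns),
    }
    needs
}
```

Under this assumed mapping, a plain decimal needs only one kind of data while a compact currency needs three, which is the scaling problem the grid is meant to expose.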
CC @robertbastian
In ICU (and ECMA-402), NumberFormat becomes the jack of all trades, formatting numbers, currencies, measurement units, and so on. There's even a drive from Shane to incorporate pluralization as a feature of a NumberFormat. Shane justified it by saying that all number formatters take similar options and similar operations to tailor the data.
The cost of such an approach is that it becomes trickier to modularize such crates, and NumberFormat becomes a pretty large codebase at a very fundamental level that is required by basically everything, with fragile DCE as the only hope of keeping the overhead low.
It is my impression that in the ICU4X context we can bring that modularity back, and I believe Shane is intending to separate the part that operates on the number (like rounding, tailoring, etc.) into FixedDecimal and similar helper structs, effectively removing the benefit of clustering all types of numerical operations in a single formatter. @sffc can you share your take on this if I misunderstood you?
If that hypothesis is correct, we have a way to split CurrencyFormatter, MeasureUnitFormatter, RuleBasedNumberFormatter, RelativeTimeFormatter, DurationFormatter, and so on into their own modular, lean components and keep each one small, without paying an overhead when all of them are in use.
This issue has been filed to discuss that and verify that we're all on the same page about how we want to tackle that topic.
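The proposal can be sketched as several small formatters that each wrap one shared, minimal core instead of living inside a single large NumberFormat. All types below (`ToyDecimalCore`, `ToyCurrencyFormatter`, `ToyMeasureUnitFormatter`) are hypothetical and simplified for illustration; in particular, placing the currency symbol as a prefix and the unit as a space-separated suffix is an assumption, since real placement is locale-dependent.

```rust
/// Small shared core: it only knows localized symbols (hypothetical type).
pub struct ToyDecimalCore {
    pub decimal_separator: char,
    pub minus_sign: char,
}

impl ToyDecimalCore {
    /// Takes a plain ASCII digit string such as "-1234.50" and swaps in the
    /// localized symbols. No arithmetic or rounding happens here.
    pub fn format(&self, digits: &str) -> String {
        digits
            .chars()
            .map(|c| match c {
                '.' => self.decimal_separator,
                '-' => self.minus_sign,
                other => other,
            })
            .collect()
    }
}

/// A separate, small component that depends only on the core.
pub struct ToyCurrencyFormatter {
    pub core: ToyDecimalCore,
    pub symbol: String,
}

impl ToyCurrencyFormatter {
    /// Simplification: the symbol is always a prefix.
    pub fn format(&self, digits: &str) -> String {
        format!("{}{}", self.symbol, self.core.format(digits))
    }
}

/// Another separate component; code that never uses it never links it.
pub struct ToyMeasureUnitFormatter {
    pub core: ToyDecimalCore,
    pub unit_label: String,
}

impl ToyMeasureUnitFormatter {
    /// Simplification: the unit is always a space-separated suffix.
    pub fn format(&self, digits: &str) -> String {
        format!("{} {}", self.core.format(digits), self.unit_label)
    }
}
```

In this shape, a client that only formats currencies depends on `ToyDecimalCore` and `ToyCurrencyFormatter` and nothing else, which is the modularity the issue is asking about; whether the real formatters can share so little is exactly the open question.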