Closed lbeaulac closed 3 years ago
Thanks for mentioning that. Would you have a PR for the factors if something is currently wrong or mismatching?
About the names, especially the CLDR system is strictly based on ICU4J MeasureUnit where US Customary units have no prefix or postfix in most cases while others like "IMPERIAL" or "SCANDINAVIAN" do. We are consistent with the Unicode standard here.
In Imperial or USCustomary there are a few mostly internal constants that are currently not publicly visible, like FLUID_OUNCE_UK, while FLUID_OUNCE is visible. I think we could clarify some of the members in the Imperial system, but I am not really convinced that we should change all the members in USCustomary, at least not without a careful survey. Plus there will always be different standards and conventions we follow, like UCUM (where almost every unit is included, hence most of them have prefixes or namespaces) or Unicode.
I'm completely new to GitHub, so have no notion of how to submit a PR or suchlike. We don't use GitHub where I work.
The chief item that I'm trying to draw attention to is the incorrect multipliers being used for PINT and QUART, given that they are intended to be USCustomary units.
I understand the reluctance to deviate from the ICU4J nomenclature, but in cases where there can be ambiguity, like the parallel systems of liquid volume units, it then becomes incumbent on the maintainers of this package to clearly state in the Javadoc for each ambiguously-named canonical unit (PINT, QUART, TEASPOON, etc) which system the unit belongs to.
Can you point to the correct ones or are the ones in USCustomary correct in your opinion?
The Unicode CLDR module should remain independent, so neither Imperial nor USCustomary shall be used as a dependency. You are correct about QUART, given that ICU 68 introduced a QUART_IMPERIAL, but PINT is IMO a purely British term; hence unless ICU4J ever distinguishes between PINT and PINT_IMPERIAL, the pint is considered British and based on the Imperial multipliers.
HTH,
Werner
The US Customary package has a definition for PINT which is correctly set to equal 4 GILL_LIQUID, (and so equal to 16 FLUID_OUNCE, as I stated initially), so at least one source maintains that a PINT isn't a British-only quantity, as you seem to believe.
And despite what ICU4J decides or doesn't decide about the PINT, I submit that it's at least as important to be internally consistent as it is to "guess" at the intentions of an undecided external authority. A PINT that is equal to 20 US Ounces is just plain wrong: it doesn't match any official quantity.
One can argue that the unit naming convention ICU4J is following is to reserve the short-form name of a unit to always mean the US unit (where there is ambiguity), and employ a suffix to denote any alternate units of the same base name. Regrettably US-centric, IMO. But following that convention, one must conclude that a PINT is meant to be a US PINT (16 US Fluid Ounces), and that a QUART is similarly meant to be a US QUART (32 US Fluid Ounces).
Curiously, the US Customary package does not have a definition for QUART. This is odd, as things like milk are commonly sold in quarts here in the US.
As long as ICU4J doesn't also define a PINT_IMPERIAL, we can only make an assumption about which one it means. They don't explain that, and while a different METRIC_PINT is already defined (but that's 8 METRIC_CUP, each 250 ml) it is completely ambiguous. ICU4J does not care about these factors. Plus the US even has a DRY_PINT, which ICU4J also does not care about at this point, so we make an assumption until it's changed in the ICU definition.
If you need to use both then you can always use USCustomary.
The current definition matches https://en.wikipedia.org/wiki/Pint and there the Imperial Pint is the first/primary entry with at least 2 US pints being second and several other variations of the Pint in different countries, most of them historic.
1 Pint = 20 imperial fluid ounces
In the United Kingdom, the imperial pint is the mandatory base unit for draught beer and cider.[4] Milk sold in returnable containers (such as glass bottles) may be sold by the pint alone and other goods may be sold by the pint if the equivalent metric measure is also given.
So it's a more official unit there than elsewhere, but given that QUART_IMPERIAL was also just introduced as DRAFT in the latest ICU4J, I decided to slightly deviate here and offer a PINT_IMPERIAL in addition to the PINT. Hopefully ICU4J will also add it under that name. We keep ignoring their multiples, although ICU4J recently came up with a "Complexity" (see https://github.com/unitsofmeasurement/indriya/issues/323) and also introduced a SIPrefix enum as a draft. It has all those multiples like CENTILITER or DECILITER, which we don't model this way since pretty much every unit can be combined with a prefix as long as the result makes sense. ICU4J, in its Complexity, states that "you cannot set the power or SI prefix of a compound unit." That covers pretty much all derived unit types, so that kind of restriction seems inappropriate.
ICU is more about spelling and composing words: if you take "meter-per-second" and combine it with "milli", that could result in funny concatenations where it's hard to distinguish between "milli(meter-per-second)" and "millimeter-per-second", but for arithmetic it should be possible.
Closing this as I think the only two factors were QUART and PINT and both got a UK equivalent now, even though the PINT goes beyond the current ICU/CLDR definition.
Thank you for following this through. Chasing down all the myriad variations of liquid measures around the world would be a trip down the proverbial rabbit hole, but you've handily addressed the two obvious errors that I pointed out. Cheers. PS. If you ever order a pint of beer in Canada, make sure you get the mandated Imperial pint, and not the smaller US pint.
The definitions for non-SI liquid volume units in the CLDR class are inconsistent.
Both US and Imperial systems have the following internal relationships:
But the Imperial quantities for those four units are larger than their US counterparts by a constant factor of about 1.2.
There are also parallel definitions for the fluid ounce, tablespoon and teaspoon, but the ratios are different:
As can be seen above, the Imperial fluid ounce is actually smaller than its US counterpart, contrary to the larger volume units, where the Imperial unit is larger than the US unit. This is explained by differences in the number of fluid ounces that make up the larger units in the two systems, i.e. an Imperial pint is 20 Imperial fluid ounces, whereas a US pint is 16 US fluid ounces.
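To make the two ratios concrete, here is a minimal plain-Java sketch (no units library involved). The class and method names are hypothetical. The litre values are the standard definitions of the two gallons, and the 128 and 160 fluid ounces per gallon follow from 8 pints per gallon at 16 and 20 fluid ounces per pint respectively.

```java
// Hypothetical helper, not part of any units library.
final class VolumeRatios {
    // US gallon: 231 cubic inches, expressed in litres.
    static final double US_GALLON_L = 3.785411784;
    // Imperial gallon: 4.54609 litres by definition.
    static final double IMPERIAL_GALLON_L = 4.54609;
    // 8 pints per gallon in both systems, but 16 vs. 20 fluid ounces per pint.
    static final int US_FL_OZ_PER_GALLON = 8 * 16;
    static final int IMPERIAL_FL_OZ_PER_GALLON = 8 * 20;

    // Imperial gallon divided by US gallon (same ratio for quart and pint).
    static double gallonRatio() {
        return IMPERIAL_GALLON_L / US_GALLON_L;
    }

    // Imperial fluid ounce divided by US fluid ounce.
    static double fluidOunceRatio() {
        double imperialFlOz = IMPERIAL_GALLON_L / IMPERIAL_FL_OZ_PER_GALLON;
        double usFlOz = US_GALLON_L / US_FL_OZ_PER_GALLON;
        return imperialFlOz / usFlOz;
    }

    public static void main(String[] args) {
        System.out.println("Imperial/US gallon ratio:      " + gallonRatio());
        System.out.println("Imperial/US fluid ounce ratio: " + fluidOunceRatio());
    }
}
```

The first ratio comes out at roughly 1.2 and the second at roughly 0.96, which is the inversion described above: larger Imperial gallon, smaller Imperial fluid ounce.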
Now, in the CLDR class, we see these units defined:
public static final Unit<Volume> GALLON = addUnit(CUBIC_INCH.multiply(231));
public static final Unit<Volume> GALLON_IMPERIAL = addUnit(LITER.multiply(454609).divide(100000));
public static final Unit<Volume> FLUID_OUNCE = addUnit(GALLON.divide(128));
public static final Unit<Volume> CUP = addUnit(FLUID_OUNCE.multiply(8));
public static final Unit<Volume> PINT = addUnit(FLUID_OUNCE.multiply(20), "Pint", "pt", true);
public static final Unit<Volume> QUART = addUnit(FLUID_OUNCE.multiply(40), "Quart", "qt");
private static final Unit<Volume> MINIM = MICRO(LITER).multiply(61.61152d);
The issue here is that the PINT and QUART quantities are using the US fluid ounce as their base reference unit, but are using the multipliers for their Imperial quantities (20 and 40 respectively), rather than the US-specific multipliers (16 and 32).
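A quick way to see the mismatch is to work the numbers in millilitres. The sketch below uses plain doubles rather than the library's Unit API, and the class and method names are hypothetical. It assumes the standard conversion of 16.387064 mL per cubic inch, plus the GALLON and GALLON_IMPERIAL factors quoted above.

```java
// Hypothetical helper, plain doubles only; not the library's Unit API.
final class PintCheck {
    // 1 inch = 2.54 cm exactly, so 1 cubic inch = 16.387064 mL.
    static final double CUBIC_INCH_ML = 16.387064;
    // Mirrors GALLON = CUBIC_INCH.multiply(231).
    static final double US_GALLON_ML = 231 * CUBIC_INCH_ML;
    // Mirrors GALLON_IMPERIAL = LITER.multiply(454609).divide(100000).
    static final double IMPERIAL_GALLON_ML = 4546.09;

    // Mirrors FLUID_OUNCE = GALLON.divide(128).
    static final double US_FLUID_OUNCE_ML = US_GALLON_ML / 128;
    // Imperial fluid ounce: 160 to the Imperial gallon.
    static final double IMPERIAL_FLUID_OUNCE_ML = IMPERIAL_GALLON_ML / 160;

    // US pint: 16 US fluid ounces.
    static double usPintMl() { return 16 * US_FLUID_OUNCE_ML; }

    // Imperial pint: 20 Imperial fluid ounces.
    static double imperialPintMl() { return 20 * IMPERIAL_FLUID_OUNCE_ML; }

    // PINT as currently defined: 20 US fluid ounces.
    static double pintAsDefinedMl() { return 20 * US_FLUID_OUNCE_ML; }

    public static void main(String[] args) {
        System.out.println("US pint (mL):         " + usPintMl());
        System.out.println("Imperial pint (mL):   " + imperialPintMl());
        System.out.println("PINT as defined (mL): " + pintAsDefinedMl());
    }
}
```

The US pint works out to about 473 mL and the Imperial pint to about 568 mL, while the quantity produced by the current PINT definition is about 591 mL, which matches neither.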
There is also the question of whether to prefer one system over the other when choosing which should be assigned the canonical name for the unit (i.e. "GALLON" vs. "GALLON_US"). I submit that having a full set of units for each system would be appropriate and less confusing (i.e. GALLON_US, QUART_US ... MINIM_US, as well as GALLON_IMPERIAL, QUART_IMPERIAL ... MINIM_IMPERIAL), though unfortunately more verbose.
Finally, it strikes me that using two reference quantities in a single system (see GALLON and MINIM definitions above) runs the risk of discontinuities in calculations when approaching from either end. Wouldn't it be preferable to pick one reference quantity for a given system and derive all other units within that system from that one?
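As a rough illustration of that concern, the sketch below compares the MINIM literal quoted above (61.61152 microlitres) with a minim derived from the gallon. It is plain Java with hypothetical names, and it brings in two facts from outside this thread as assumptions: 16.387064 mL per cubic inch, and 480 US minims per US fluid ounce (8 fluid drams of 60 minims each).

```java
// Hypothetical helper, plain doubles only; not the library's Unit API.
final class MinimConsistency {
    // Gallon-based chain: 231 cubic inches, 128 fluid ounces per gallon.
    static final double US_GALLON_ML = 231 * 16.387064;
    static final double US_FLUID_OUNCE_ML = US_GALLON_ML / 128;

    // Literal from MINIM = MICRO(LITER).multiply(61.61152d).
    static final double DECLARED_MINIM_UL = 61.61152;

    // Assumption: 1 US fluid ounce = 8 fluid drams = 480 minims.
    static double derivedMinimUl() {
        return US_FLUID_OUNCE_ML / 480 * 1000; // mL to microlitres
    }

    // Relative gap between the literal and the gallon-derived value.
    static double relativeDifference() {
        return (DECLARED_MINIM_UL - derivedMinimUl()) / derivedMinimUl();
    }

    public static void main(String[] args) {
        System.out.println("Declared minim (uL): " + DECLARED_MINIM_UL);
        System.out.println("Derived minim (uL):  " + derivedMinimUl());
        System.out.println("Relative difference: " + relativeDifference());
    }
}
```

The gap is tiny (on the order of one part in a billion, because the literal is a rounded value), but it means 480 MINIM does not exactly equal one FLUID_OUNCE. Deriving MINIM from FLUID_OUNCE would remove the discrepancy entirely.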