unitsofmeasurement / indriya

JSR 385 - Reference Implementation

Support parsing quantity strings for specific locale #351

Closed wborn closed 3 years ago

wborn commented 3 years ago

It would be helpful if support is added for parsing strings using a specific locale by having a method similar to Quantities.getQuantity(CharSequence) that has an additional Locale parameter.

It could then, for instance, create a SimpleQuantityFormat for that locale, so that SimpleQuantityFormat.parse builds its NumberFormat from the specified Locale instead of the default Locale. The default locale can change depending on what is configured in the OS or on whether code calls Locale.setDefault.

Besides making parsing behavior more deterministic, this would also help when users enter a String in a UI using their own Locale (different from the default JVM Locale).
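To make the request concrete, here is a minimal sketch of locale-aware parsing using only java.text. It is not Indriya code: the class LocaleQuantityParsing, its parse method and the ParsedQuantity holder are hypothetical names, and the unit text is returned as a plain String instead of being resolved to a Unit.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class LocaleQuantityParsing {

    /** Hypothetical holder: the parsed number plus the remaining unit text. */
    public static final class ParsedQuantity {
        public final Number value;
        public final String unitText;

        ParsedQuantity(Number value, String unitText) {
            this.value = value;
            this.unitText = unitText;
        }
    }

    /** Parses the leading number with the given locale; the rest is treated as the unit. */
    public static ParsedQuantity parse(CharSequence text, Locale locale) {
        String input = text.toString().trim();
        NumberFormat numberFormat = NumberFormat.getInstance(locale);
        ParsePosition position = new ParsePosition(0);
        Number value = numberFormat.parse(input, position);
        if (value == null) {
            throw new IllegalArgumentException("No number at start of: " + input);
        }
        // Everything after the number is handed on as raw unit text.
        String unitText = input.substring(position.getIndex()).trim();
        return new ParsedQuantity(value, unitText);
    }

    public static void main(String[] args) {
        ParsedQuantity german = parse("1,5 km", Locale.GERMAN);   // ',' is the decimal separator
        ParsedQuantity english = parse("1,5 km", Locale.ENGLISH); // ',' is a grouping separator
        System.out.println(german.value + " " + german.unitText);
        System.out.println(english.value + " " + english.unitText);
    }
}
```

The same input yields 1.5 for a German locale but 15 for an English one, which is why an explicit Locale parameter matters when the string comes from a user.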

keilw commented 3 years ago

Please note that this already exists, but you should use LocalUnitFormat instead. Quantities.getQuantity(CharSequence) is mainly a convenience method alongside the actual factories like getQuantity(Number, Unit), so I think adding a locale-specific helper there would be overkill.

You can use ServiceProvider.current().getFormatService().getQuantityFormat("Local") to get a localized QuantityFormat for the current locale via SPI. Anything beyond that is possible via NumberDelimiterQuantityFormat.Builder. Also see #349, which was about passing a CompactNumberFormat (available from Java 12 on); you can use that builder to combine the UnitFormat and NumberFormat of your own choice if those predefined via SPI are not sufficient. Does this answer your question?

keilw commented 3 years ago

Btw @wborn, I see you're based in the Netherlands. I understand @thodorisbais is now busy with other duties, but if he can still help his JUG, at least once COVID allows that again, maybe you can also get together, e.g. at a Meetup or something.

wborn commented 3 years ago

Thanks for your quick answers and showing me around the SPI @keilw. :+1: The configuration options on the NumberDelimiterQuantityFormat look very promising.

We ran into some parsing issues caused by the new default-locale-dependent behavior of Quantities.getQuantity after upgrading openHAB from javax.measure 1.x to 2.x.
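The effect can be reproduced without any units library. The following sketch (the class and method names are made up for illustration) parses the same text under two different JVM default locales, using only java.text.NumberFormat and Locale.setDefault:

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class DefaultLocaleDemo {

    /** Parses with whatever the default FORMAT locale is at call time. */
    public static Number parseWithDefault(String text) {
        Number value = NumberFormat.getInstance().parse(text, new ParsePosition(0));
        if (value == null) {
            throw new IllegalArgumentException("Not a number: " + text);
        }
        return value;
    }

    /** Temporarily switches the default FORMAT locale, parses, then restores it. */
    public static Number parseUnderDefault(String text, Locale temporaryDefault) {
        Locale previous = Locale.getDefault(Locale.Category.FORMAT);
        Locale.setDefault(Locale.Category.FORMAT, temporaryDefault);
        try {
            return parseWithDefault(text);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, previous);
        }
    }

    public static void main(String[] args) {
        // Same input, different result depending on the default locale.
        System.out.println(parseUnderDefault("1.234", Locale.US));      // '.' read as decimal separator
        System.out.println(parseUnderDefault("1.234", Locale.GERMANY)); // '.' read as grouping separator
    }
}
```

Under a US default the text is read as 1.234, under a German default as 1234, so code that relies on the default locale can break when the host configuration changes.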

Creating the following format seems to get back the more deterministic behaviour that was once provided by tec.uom.se.quantity.Quantities.getQuantity:

    NumberDelimiterQuantityFormat format = new NumberDelimiterQuantityFormat.Builder()
            .setNumberFormat(NumberFormat.getInstance(Locale.ENGLISH))
            .setUnitFormat(SimpleUnitFormat.getInstance())
            .setLocaleSensitive(false)
            .build();

If I use this format all the unit tests still succeed but that doesn't always guarantee the absence of issues. :innocent:

Do you think this would behave much like the SimpleQuantityFormat used by Quantities.getQuantity, or would we need other configuration options to prevent issues? This way it would also be easy to change the locale of the number format.

Also, it is not really clear to me how the localeSensitive option is (or will be) used. I did not find any code that uses it. We probably need to set it to false when creating formats that depend on a specific locale?

keilw commented 3 years ago

SimpleUnitFormat is completely locale-agnostic, and localeSensitive only returns true if the Locale may influence the output or parsing for a particular format. So

    NumberDelimiterQuantityFormat format = new NumberDelimiterQuantityFormat.Builder()
            .setNumberFormat(NumberFormat.getInstance(Locale.ENGLISH))
            .setUnitFormat(SimpleUnitFormat.getInstance())
            .setLocaleSensitive(false)
            .build();

seems misleading, because localeSensitive there should be true, since you apply an ENGLISH Locale. I admit it is a bit tricky, because NumberFormat, unlike UnitFormat, does not provide any proper information about the Locale it was created with. So QuantityFormat.isLocaleSensitive() is a bit of meta-information. The only place it can be derived from is the underlying UnitFormat, which is why the Builder allows you to adjust it: if the NumberFormat is locale-sensitive, you need to set that in your code.
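As a rough illustration of what "locale-sensitive" means for the number part, the sketch below formats one value with two locale-specific NumberFormat instances. The class LocaleSensitivityCheck and its methods are hypothetical helpers, not part of Indriya or the JSR 385 API.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocaleSensitivityCheck {

    /** Formats the value with a NumberFormat created for the given locale. */
    public static String format(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    /** Hypothetical helper: true if the two locales render the sample differently. */
    public static boolean rendersDifferently(double sample, Locale a, Locale b) {
        return !format(sample, a).equals(format(sample, b));
    }

    public static void main(String[] args) {
        System.out.println(format(1234.5, Locale.ENGLISH)); // 1,234.5
        System.out.println(format(1234.5, Locale.GERMAN));  // 1.234,5
        System.out.println(rendersDifferently(1234.5, Locale.ENGLISH, Locale.GERMAN)); // true
    }
}
```

Because the chosen Locale changes the output (and likewise the parsing), a format built around NumberFormat.getInstance(Locale.ENGLISH) would be one where the flag is expected to be true.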

wborn commented 3 years ago

Thanks for the info! I'll close the issue since I think it will work this way. :slightly_smiling_face:

keilw commented 3 years ago

Thanks for closing it. I made a small adjustment to use the flag of the UnitFormat, but as mentioned, with a NumberFormat it is up to your code to set it. It is primarily a "warning sign", if you will, of how that formatter behaves.