averbraeck / djunits

Delft Java UNIT System for using strongly-typed quantities and units
BSD 3-Clause "New" or "Revised" License

Parsing of Scalar using a Locale is not working properly #4

Closed: averbraeck closed this issue 1 year ago

averbraeck commented 1 year ago

In the LocaleDemo class, I added the following code for testing:

        System.out.println("\nParsing UK");
        Locale.setDefault(new Locale("en", "UK"));
        Speed speed = Speed.valueOf("14.2 km/h");
        System.out.println(speed.toTextualString());
        System.out.println(speed.toDisplayString());
        System.out.println(speed);

        System.out.println("\nParsing NL");
        Locale.setDefault(new Locale("nl", "NL"));
        speed = Speed.valueOf("14.2 km/u");
        System.out.println(speed.toTextualString());
        System.out.println(speed.toDisplayString());
        System.out.println(speed);

The output is:

Parsing UK
14.2 km/h
14.2 km/h
14.2000000 km/h

Parsing NL
Exception in thread "main" java.lang.IllegalArgumentException: Error parsing Speed from 14.2 km/u
    at org.djunits.value.vdouble.scalar.Speed.valueOf(Speed.java:196)
    at org.djunits.demo.examples.LocaleDemo.main(LocaleDemo.java:55)

averbraeck commented 1 year ago

LocaleDemo now correctly parses localized strings:

        System.out.println("\nPrinting US");
        Locale.setDefault(Locale.US);
        Duration hour = new Duration(3.0, DurationUnit.HOUR);
        System.out.println(hour.toTextualString());
        System.out.println(hour);

        System.out.println("\nPrinting NL");
        Locale.setDefault(Locale.forLanguageTag("NL"));
        System.out.println(hour.toTextualString());
        System.out.println(hour);

        System.out.println("\nParsing UK");
        Locale.setDefault(new Locale("en", "UK"));
        Speed speed = Speed.valueOf("14.2 km/h");
        System.out.println(speed.toTextualString());
        System.out.println(speed.toDisplayString());
        System.out.println(speed);

        try
        {
            speed = Speed.valueOf("14.2 km/u");
            System.err.println("WRONG, should not be able to parse 14.2 km/u in UK locale");
        }
        catch (Exception e)
        {
            System.out.println("Correctly failed to parse 14.2 km/u in UK locale");
        }

        System.out.println("\nParsing NL");
        Locale.setDefault(new Locale("nl", "NL"));
        speed = Speed.valueOf("14.2 km/u");
        System.out.println(speed.toTextualString());
        System.out.println(speed.toDisplayString());
        System.out.println(speed);

        try
        {
            speed = Speed.valueOf("14.2 km/z");
            System.err.println("WRONG, should not be able to parse 14.2 km/z");
        }
        catch (Exception e)
        {
            System.out.println("Correctly failed to parse 14.2 km/z");
        }

produces:

Printing US
3.0 h
3.00000000 h

Printing NL
3.0 h
3,00000000 h

Parsing UK
14.2 km/h
14.2 km/h
14.2000000 km/h
Correctly failed to parse 14.2 km/u in UK locale

Parsing NL
14.2 km/h
14.2 km/h
14,2000000 km/h
Correctly failed to parse 14.2 km/z

averbraeck commented 1 year ago

For now, parsing seems to work well. Will build some unit tests to check the different configurations and changes.

averbraeck commented 1 year ago

Parsing with localized units works, but parsing a number that uses the decimal separator of the Locale (decimal comma instead of decimal point) does not work yet:

        Locale.setDefault(new Locale("nl", "NL"));
        speed = Speed.valueOf("14,2 km/u");

yields:

Exception in thread "main" java.lang.IllegalArgumentException: Error parsing Speed from 14,2 km/u
    at org.djunits.value.vdouble.scalar.Speed.valueOf(Speed.java:196)
    at org.djunits.demo.examples.LocaleDemo.main(LocaleDemo.java:69)

averbraeck commented 1 year ago

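This is not easy. Double.parseDouble() ignores the current Locale and always expects a decimal point. NumberFormat can parse localized numbers, but it is a bit strict: it does not accept a +-sign at the start of the number or after the exponent separator, and it only recognizes the exponent separator in one case (for example, E but not e).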
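
The difference can be reproduced outside djunits. A minimal sketch using only the JDK; the class name LocaleNumberProbe and its two helper methods are made up for this illustration and are not part of djunits:

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LocaleNumberProbe
{
    /** Returns true when Double.parseDouble() accepts the text; it never consults a Locale. */
    static boolean parsesWithParseDouble(final String text)
    {
        try
        {
            Double.parseDouble(text);
            return true;
        }
        catch (NumberFormatException e)
        {
            return false;
        }
    }

    /** Parses the text with the NumberFormat of the given Locale; returns NaN when that fails. */
    static double parseLocalized(final String text, final Locale locale)
    {
        try
        {
            return NumberFormat.getInstance(locale).parse(text).doubleValue();
        }
        catch (ParseException e)
        {
            return Double.NaN;
        }
    }

    public static void main(final String[] args)
    {
        Locale nl = Locale.forLanguageTag("nl-NL");
        System.out.println(parsesWithParseDouble("14.2")); // true
        System.out.println(parsesWithParseDouble("14,2")); // false, the decimal comma is rejected
        System.out.println(parseLocalized("14,2", nl));    // 14.2
    }
}
```

Setting Locale.setDefault() before the calls does not change the first two results, which is why a NumberFormat-based parser is needed for the localized case.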

Apart from that, the DecimalFormat class works fine for parsing. A solution might be to strip a +-sign at the start of the number and directly after the exponent separator, and to convert a 1-character exponent separator to the case that DecimalFormat expects. Note that the exponent separator is a String that could consist of multiple characters.
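
The strictness can be made visible with a ParsePosition, which reports how far the parser got. A small sketch, assuming the default (lenient) parser of the JDK and Locale.US; the class DecimalFormatStrictnessProbe is hypothetical and only serves as illustration:

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class DecimalFormatStrictnessProbe
{
    /** Returns the parsed value for Locale.US, or NaN when nothing could be parsed. */
    static double value(final String text)
    {
        Number n = NumberFormat.getInstance(Locale.US).parse(text, new ParsePosition(0));
        return n == null ? Double.NaN : n.doubleValue();
    }

    /** Returns how many leading characters of the text were consumed by the parser. */
    static int consumed(final String text)
    {
        ParsePosition pp = new ParsePosition(0);
        NumberFormat.getInstance(Locale.US).parse(text, pp);
        return pp.getIndex();
    }

    public static void main(final String[] args)
    {
        // "+14.2"  : nothing is parsed, because the leading plus is not accepted
        // "1.5e3"  : parsing stops before the lower case 'e'
        // "1.5E+3" : parsing stops before the 'E', because of the plus after it
        // "1.5E3"  : the whole text is parsed as 1500
        for (String text : new String[] {"+14.2", "1.5e3", "1.5E+3", "1.5E3"})
        {
            System.out.println(text + " => value = " + value(text) + ", consumed = " + consumed(text));
        }
    }
}
```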

averbraeck commented 1 year ago

This piece of test code seems to do the job:

// needs: java.text.DecimalFormat, java.text.NumberFormat, java.text.ParsePosition, java.util.Locale
private static void parse(final String text)
{
    NumberFormat formatter = NumberFormat.getInstance();
    if (formatter instanceof DecimalFormat)
    {
        // Remove a potential plus at the start and a potential plus after the exponent separator,
        // and put the exponent separator in the right case. The exponent character can also occur
        // in the unit string (e.g., 'dyne'), so the unit is read from noPlus, which keeps its case.
        String noPlus = text.startsWith("+") ? text.substring(1) : text;
        String changedExp = noPlus;
        DecimalFormat df = (DecimalFormat) formatter;
        String exponentSeparator = df.getDecimalFormatSymbols().getExponentSeparator();
        if (exponentSeparator.length() == 1)
        {
            int expIndexLo = noPlus.indexOf(exponentSeparator.toLowerCase());
            int expIndexUp = noPlus.indexOf(exponentSeparator.toUpperCase());
            // both found: take the first occurrence; otherwise take the one that was found (or -1)
            int expIndex = Math.min(expIndexLo, expIndexUp) >= 0 ? Math.min(expIndexLo, expIndexUp)
                    : Math.max(expIndexLo, expIndexUp);
            if (expIndex != -1)
            {
                if (expIndex + 1 < noPlus.length() && noPlus.charAt(expIndex + 1) == '+')
                {
                    noPlus = noPlus.substring(0, expIndex) + exponentSeparator + noPlus.substring(expIndex + 2);
                    changedExp = noPlus;
                }
                else
                {
                    changedExp = changedExp.substring(0, expIndex) + exponentSeparator + noPlus.substring(expIndex + 1);
                }
            }
        }
        ParsePosition pp = new ParsePosition(0);
        Number n = df.parse(changedExp, pp);
        double d = (n == null) ? Double.NaN : n.doubleValue();
        String unitString = noPlus.substring(pp.getIndex()).trim();
        System.out.println("Locale " + Locale.getDefault().getCountry() + " NumberFormat.parse(" + text + ") => " + d
                + ", parsePosition = " + pp.getIndex() + ", rest = " + unitString);
    }
}
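
A variant that returns the number and the unit instead of printing them could look like the sketch below. It is not the code that was pushed to djunits, and it uses a slightly different technique: the number is parsed first, and the exponent is only normalized when the parser stopped right at an exponent separator. That also covers exponent separators of more than one character. The names LocalizedScalarSplitter, Parsed and split are hypothetical, and the sketch assumes that the Locale uses '-' as its minus sign.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class LocalizedScalarSplitter
{
    /** Hypothetical holder for the parsed number and the remaining unit text. */
    static final class Parsed
    {
        final double value;
        final String unit;

        Parsed(final double value, final String unit)
        {
            this.value = value;
            this.unit = unit;
        }
    }

    static Parsed split(final String text, final Locale locale)
    {
        NumberFormat nf = NumberFormat.getInstance(locale);
        if (!(nf instanceof DecimalFormat))
        {
            throw new IllegalArgumentException("No DecimalFormat available for locale " + locale);
        }
        DecimalFormat df = (DecimalFormat) nf;
        String exp = df.getDecimalFormatSymbols().getExponentSeparator();

        // drop surrounding whitespace and a leading plus, which DecimalFormat does not accept
        String s = text.trim();
        if (s.startsWith("+"))
        {
            s = s.substring(1);
        }

        // first pass: find where DecimalFormat stops reading the number
        ParsePosition pp = new ParsePosition(0);
        Number n = df.parse(s, pp);
        if (n == null)
        {
            throw new IllegalArgumentException("Error parsing number from " + text);
        }
        int end = pp.getIndex();

        // if parsing stopped at an exponent separator (in any case) that is followed by an
        // optional sign and a digit, rewrite only that part and parse again
        if (s.regionMatches(true, end, exp, 0, exp.length()))
        {
            int digits = end + exp.length();
            String sign = "";
            if (digits < s.length() && (s.charAt(digits) == '+' || s.charAt(digits) == '-'))
            {
                sign = s.charAt(digits) == '-' ? "-" : "";
                digits++;
            }
            if (digits < s.length() && Character.isDigit(s.charAt(digits)))
            {
                s = s.substring(0, end) + exp + sign + s.substring(digits);
                pp = new ParsePosition(0);
                n = df.parse(s, pp);
                end = pp.getIndex();
            }
        }
        // everything after the number is the unit; its case is untouched
        return new Parsed(n.doubleValue(), s.substring(end).trim());
    }

    public static void main(final String[] args)
    {
        Parsed p = split("14,2 km/u", Locale.forLanguageTag("nl-NL"));
        System.out.println(p.value + " [" + p.unit + "]");
    }
}
```

Because the unit text is never searched for the exponent character, a unit such as 'dyne' needs no special handling. One thing to keep in mind is that DecimalFormat treats the point as a grouping separator in a Locale such as nl-NL, so the input has to use the decimal separator of the Locale that is passed in.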

averbraeck commented 1 year ago

Changes for this issue have been pushed to the main branch.