unitsofmeasurement / indriya

JSR 385 - Reference Implementation
Other
118 stars 40 forks source link

SimpleUnitFormat fails at parsing more complex units #275

Closed thomas-z96 closed 4 years ago

thomas-z96 commented 4 years ago

With more complex units, SimpleUnitFormat/DefaultFormat cannot parse() the output of format(). Example:

    @Test
    public void testFormatParse() {
        UnitFormat format = SimpleUnitFormat.getInstance();
        Unit<Acceleration> unit = MILLI(METRE_PER_SQUARE_SECOND);
        assertThat(format.parse(format.format(unit))).isEqualTo(unit);
    }

format() outputs "m(m/s²)", but parse() throws an exception: javax.measure.format.MeasurementParseException: unexpected token OPEN_PAREN at tech.units.indriya.format.SimpleUnitFormat$DefaultFormat.parseProductUnit(SimpleUnitFormat.java:583) at tech.units.indriya.format.SimpleUnitFormat.parseObject(SimpleUnitFormat.java:300) at tech.units.indriya.format.SimpleUnitFormat$DefaultFormat.parse(SimpleUnitFormat.java:838) at tech.units.indriya.format.SimpleUnitFormat$DefaultFormat.parse(SimpleUnitFormat.java:833) at tech.units.indriya.format.SimpleUnitFormat$DefaultFormat.parse(SimpleUnitFormat.java:824)

keilw commented 4 years ago

Thanks, but there's a reason why it's called "Simple", did you try especially EBNFUnitFormat with this test?

thomas-z96 commented 4 years ago

Hi, I also tried EBNFUnitFormat, but it looks like its format() method ignores prefixes, hence the test fails, too. So what is the purpose of SimpleUnitFormat if one cannot rely on format() and parse() to handle all units correctly, even if the documentation explicitly states that MetricPrefixes are recognized? Also JSR-385 says that "Bidirectional transformations between units and string representations (parsing and formatting) are performed by implementations of UnitFormat".

Does that mean that if I want to have a Unit -> String -> Unit conversion that works reliably, the indriya implementation does not provide this?

keilw commented 4 years ago

The format method ignores prefixes, that sounds weird. Would have to look into that. While it is not always bidirectional by design, the UCUMFormat in https://github.com/unitsofmeasurement/uom-systems/tree/master/ucum is probably the most sophisticated implementation. Please try that if you can. The format() output looks more than correct btw, because the braces are the only way to tell it apart from "mm/s²", which would be something different, so it is a bit of a corner case. And the simple out of the box implementation may not always cover every corner case. We'll check out both, if you say EBNFUnitFormat supresses the prefix, that sounds like a bug.

thomas-z96 commented 4 years ago

I also think that the simple format() output is fine, although one can argue that milli(metre/second^2) is quite the same as millimetre/second^2 in that specific case ;-) The parsing is the problem here as it interprets the first "m" as "metre" and then crashes when it sees the open parenthesis. When I played around with EBNFUnitFormat and tried UnitFormat formatter = EBNFUnitFormat.getInstance(); System.out.println(formatter.format(GIGA(METRE_PER_SECOND))); System.out.println(formatter.format(NEWTON)); System.out.println(formatter.format(MILLI(NEWTON))); the output is: m/s N kg·m/s² So prefixes also seem to cause problems with recognizing alternate units.

So for now, I'll stick with SimpleUnitFormat and restrict the units that are formatted to be system units (which is not a big restriction in my specific use case), as this seems to work fine. Maybe I'll have a look at the UCUMFormat you mentioned, too. Thanks!

keilw commented 4 years ago

I created #278, because the missing prefix in some or many cases is an issue. This one we have to see, if possible it would also be good to get it bi-directional, but it has a lower priority IMO.

keilw commented 4 years ago

@thomas-z96 #278 was fixed recently, it should be included in the next update release (2.0.5) so please try EBNFUnitFormat or LocalUnitFormat with the snapshot, I am not sure, if we'll enhance SimpleUnitFormat at least for now?