unitsofmeasurement / uom-systems

Units of Measurement Systems
http://www.uom.systems

Parsing of Yard interpreted day #130

Closed lfoppiano closed 5 years ago

lfoppiano commented 5 years ago

Hi, I've also noticed that the class systems.uom.ucum.format.UCUMFormat.Parsing parses yd (yards) as yday, i.e. as a prefixed day.

keilw commented 5 years ago

Does that still occur in Systems UCUM 1.0 or the 2.0-SNAPSHOT?

lfoppiano commented 5 years ago

This problem occurs in version 0.9. Version 2.0-SNAPSHOT looks good.

keilw commented 5 years ago

Thanks, then I believe we can close that.

lfoppiano commented 4 years ago

Sorry, while testing #132 I noticed that 'yards' are actually not converted correctly (same problem as reported).

I'm now using:

    compile 'tech.units:indriya:2.0.3'
    compile group: 'si.uom', name: 'si-units', version: '2.0.1'
    compile group: 'si.uom', name: 'si-quantity', version: '2.0.1'
    compile group: 'systems.uom', name: 'systems-quantity', version: '2.0.1'
    compile group: 'systems.uom', name: 'systems-common', version: '2.0.1'
    compile group: 'systems.uom', name: 'systems-unicode', version: '2.0.1'
    compile group: 'systems.uom', name: 'systems-ucum', version: '2.0.1'

UCUMFormat.Parsing is converting yd into yday.

Does it make sense to use the UCUMFormatter for this kind of unit?

keilw commented 4 years ago

It does, but you must apply proper UCUM codes when using it. I guess you wanted to parse "yard"; see http://unitsofmeasure.org/ucum.html#intcust. That is either [yd_i] (case-sensitive) or [YD_I] (case-insensitive), while "yd" is read as the prefix "y" plus the unit "d", which results in "yoctoday" ;-) UCUM doesn't allow units formatted with the PRINT variant to be parsed back via another formatting variant. It is specified this way because without the special UCUM codes there is always a risk of ambiguity and overlap.
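
The overlap can be illustrated with a minimal, self-contained sketch. It is not the systems-ucum implementation: the class `ToyUcumLookup`, its two small tables, and the `resolve` method are hypothetical, and only a hand-picked subset of case-sensitive UCUM codes is included. It assumes a lookup that tries the whole code first and then a prefix plus a unit atom.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy lookup for illustration only; not the systems-ucum implementation. */
final class ToyUcumLookup {
    // Hypothetical, hand-picked subset of case-sensitive UCUM unit atoms.
    private static final Map<String, String> UNITS = new LinkedHashMap<>();
    // Hypothetical subset of case-sensitive UCUM metric prefixes.
    private static final Map<String, String> PREFIXES = new LinkedHashMap<>();

    static {
        UNITS.put("d", "day");
        UNITS.put("m", "meter");
        UNITS.put("[yd_i]", "international yard");
        PREFIXES.put("k", "kilo");
        PREFIXES.put("y", "yocto");
        PREFIXES.put("Y", "yotta");
    }

    /** Resolves a code to a readable name, or returns null when nothing matches. */
    static String resolve(String code) {
        String exact = UNITS.get(code);
        if (exact != null) {
            return exact; // the whole code is a known unit atom
        }
        for (Map.Entry<String, String> prefix : PREFIXES.entrySet()) {
            if (code.startsWith(prefix.getKey())) {
                String rest = UNITS.get(code.substring(prefix.getKey().length()));
                if (rest != null) {
                    return prefix.getValue() + rest; // prefix followed by a unit atom
                }
            }
        }
        return null; // neither an atom nor prefix + atom
    }

    public static void main(String[] args) {
        System.out.println("yd     -> " + resolve("yd"));
        System.out.println("[yd_i] -> " + resolve("[yd_i]"));
    }
}
```

In this toy table "yd" can only be read as the prefix "y" plus the atom "d", while the bracketed code "[yd_i]" matches the yard directly, which is the kind of ambiguity the special UCUM codes are meant to avoid.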

lfoppiano commented 4 years ago

OK. Thanks for the details. So if I understood correctly, I can't use yd; I should use yd_i when referring to yards.

I tried to pass yd_i instead of yd to the parser, but it doesn't seem to like it much:

! systems.uom.ucum.internal.format.TokenException: null
! at systems.uom.ucum.internal.format.UCUMFormatParser.SimpleUnit(UCUMFormatParser.java:207)
! at systems.uom.ucum.internal.format.UCUMFormatParser.Annotatable(UCUMFormatParser.java:180)
! at systems.uom.ucum.internal.format.UCUMFormatParser.Component(UCUMFormatParser.java:122)
! at systems.uom.ucum.internal.format.UCUMFormatParser.Term(UCUMFormatParser.java:77)
! at systems.uom.ucum.internal.format.UCUMFormatParser.parseUnit(UCUMFormatParser.java:67)
! at systems.uom.ucum.format.UCUMFormat$Parsing.parse(UCUMFormat.java:508)
! at systems.uom.ucum.format.UCUMFormat$Parsing.parse(UCUMFormat.java:526)

it fails here:


    final public Unit SimpleUnit() throws TokenException {
        Token token = null;
        token = jj_consume_token(ATOM);
        // First try the whole token as a unit symbol.
        Unit unit = symbols.getUnit(token.image);
        if (unit == null) {
            // Otherwise try to split the token into a prefix plus a unit symbol.
            Prefix prefix = symbols.getPrefix(token.image);
            if (prefix != null) {
                String prefixSymbol = symbols.getSymbol(prefix);
                unit = symbols.getUnit(token.image.substring(prefixSymbol.length()));
                if (unit != null) {
                    {
                        return unit.transform(MultiplyConverter.ofPrefix(prefix));
                    }
                }
            }
            // Neither lookup matched, so parsing fails here.
            {
                throw new TokenException();
            }
            // ... (rest of the method omitted)

The unit is null. Here is the debug information:

[image]

keilw commented 4 years ago

Please have a look at UCUMFormatDemo, line 42 and below. It must be precisely "[yd_i]" with the brackets, because "yd_i" is the print output. Don't mind the first output in the line that says "Parsing"; those use SimpleUnitFormat, and there is a TODO in the UCUM class about that. Adding the exact same string as a label to all of them is possible but tedious; if someone wishes to offer a PR here, I'd be happy about that. However, inside the UCUM system all results are consistent with each other, and the last line shows that parsing "[yd_i]" with the case-sensitive variant and "[YD_I]" with the case-insensitive one (so "[yD_I]" and all other case variations also work for that) creates two equal Unit instances. This is the international yard, by the way; "[yd_us]" does the same for US yards.
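
As a rough illustration of the two variants, here is a self-contained sketch. It is a toy model, not the library code: `ToyUcumVariants`, `ToyUnit`, and the tables are hypothetical names, and upper-casing the input for the case-insensitive lookup is an assumption about how such a variant could work, not a statement about how UCUMFormat does it. The codes `[yd_i]` and `[YD_I]` are the ones from the UCUM table linked earlier in the thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy model of case-sensitive vs. case-insensitive code tables; not the library code. */
final class ToyUcumVariants {
    enum Variant { CASE_SENSITIVE, CASE_INSENSITIVE }

    enum ToyUnit { YARD, DAY }

    // Hypothetical subsets of the two UCUM code columns.
    private static final Map<String, ToyUnit> CASE_SENSITIVE_CODES = new HashMap<>();
    private static final Map<String, ToyUnit> CASE_INSENSITIVE_CODES = new HashMap<>();

    static {
        CASE_SENSITIVE_CODES.put("[yd_i]", ToyUnit.YARD);
        CASE_SENSITIVE_CODES.put("d", ToyUnit.DAY);
        CASE_INSENSITIVE_CODES.put("[YD_I]", ToyUnit.YARD);
        CASE_INSENSITIVE_CODES.put("D", ToyUnit.DAY);
    }

    /** Looks up a code in the table for the given variant; throws if it is unknown. */
    static ToyUnit parse(String code, Variant variant) {
        ToyUnit unit = (variant == Variant.CASE_SENSITIVE)
                ? CASE_SENSITIVE_CODES.get(code)
                // Assumption: the case-insensitive variant normalizes input to upper case.
                : CASE_INSENSITIVE_CODES.get(code.toUpperCase(Locale.ROOT));
        if (unit == null) {
            throw new IllegalArgumentException("Unknown unit code: " + code);
        }
        return unit;
    }

    public static void main(String[] args) {
        ToyUnit a = parse("[yd_i]", Variant.CASE_SENSITIVE);
        ToyUnit b = parse("[yD_I]", Variant.CASE_INSENSITIVE);
        System.out.println("same unit: " + (a == b));
        try {
            parse("yd_i", Variant.CASE_SENSITIVE); // brackets missing
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the bracketed code is the whole key, so "yd_i" without brackets is simply a different string that is not in the table, while every case variation of "[YD_I]" maps to the same unit in the case-insensitive variant.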

lfoppiano commented 4 years ago

All right. I understood. Thanks!