Unidata / UDUNITS-2

API and utility for arithmetic manipulation of units of physical quantities
http://www.unidata.ucar.edu/software/udunits
Other
62 stars 36 forks source link

Issue w/ comparison of abbreviated + plural unit for "foot" and "inch" #113

Closed elipousson closed 1 year ago

elipousson commented 1 year ago

I initially reported this issue in the R package units but the maintainers rightly noted that the issue upstream with this library. I reviewed the existing issues and didn't see any prior reports flagging the same issue (with the possible exception of #39).

There appear to be some inconsistencies in the handling of plural and abbreviated units "inch" and "foot" units. I initially thought this may just be how the pre-defined units from UDUNITS2 work but I noticed that "foot" and "feet" are listed as valid name_singular_aliases and name_plural_aliases in the data.frame returned by units::valid_udunits() (and plural forms that work like degrees are not) so I'm unsure why these comparisons work the way they do.

Here is a reprex created using the R units package:

# Make a helpful function to compare character strings as units
is_same_units <- function(x, y) {
  units::as_units(x) == units::as_units(y)
}

# These comparisons work as expected

# Plural form
is_same_units("kilometer", "kilometers")
#> [1] TRUE
is_same_units("mile", "miles")
#> [1] TRUE
is_same_units("degree", "degrees")
#> [1] TRUE

# Abbreviation
is_same_units("km", "kilometer")
#> [1] TRUE
is_same_units("mi", "mile")
#> [1] TRUE

# These comparisons do not work as expected

is_same_units("inch", "inches")
#> [1] FALSE
is_same_units("international_inch", "international_inches")
#> [1] FALSE
is_same_units("foot", "feet")
#> [1] FALSE

is_same_units("in", "inch")
#> [1] FALSE
is_same_units("in", "international_inch")
#> [1] FALSE
is_same_units("ft", "foot")
#> [1] FALSE

Created on 2022-12-18 with reprex v2.0.2

Here is my basic environmental information:

platform       aarch64-apple-darwin20                     
arch           aarch64                                    
os             darwin20                                   
system         aarch64, darwin20                          
status         Patched                                    
major          4                                          
minor          2.0                                        
year           2022                                       
month          05                                         
day            23                                         
svn rev        82396                                      
language       R                                          
version.string R version 4.2.0 Patched (2022-05-23 r82396)
nickname       Vigorous Calisthenics 

I was unsure the best way to report the version of UDUNITS I have installed but, since I have it installed with Homebrew, this is where my database is installed: /opt/homebrew/Cellar/udunits/2.2.28/share/udunits/udunits2-common.xml

Enchufa2 commented 1 year ago

And here's a minimal reproducible example of the core issue:

#include <stdio.h>
#include <udunits2/udunits2.h>

int main() {
    ut_set_error_message_handler((ut_error_message_handler) ut_ignore);
    ut_system *sys = ut_read_xml(NULL);
    ut_encoding enc = UT_UTF8;
    ut_set_error_message_handler((ut_error_message_handler) vprintf);

    double val_from = 1.0, val_to;
    char *from = "foot", *to = "feet";
    ut_unit *u_from = ut_parse(sys, from, enc);
    ut_unit *u_to = ut_parse(sys, to, enc);
    cv_converter *cv = ut_get_converter(u_from, u_to);
    cv_convert_doubles(cv, &val_from, 1, &val_to);

    printf("From: %.20f %s\n", val_from, from);
    printf("To  : %.20f %s\n", val_to, to);

    cv_free(cv);
    ut_free(u_from);
    ut_free(u_to);
    ut_free_system(sys);
    return 0;
}

I get:

$ gcc test.c -ludunits2
$ ./a.out 
From: 1.00000000000000000000 foot
To  : 0.99999999999999988898 feet
semmerson commented 1 year ago

Hi Eli & Iñaki,

I don't know how R is implementing the "==" operator between units. Hopefully, it uses the function ut_compare().

I added the following tests of ut_compare() to the package's unit-test:

ut_unit* inch = ut_parse(xmlSystem, "inch", UT_ASCII);
CU_ASSERT_PTR_NOT_NULL(inch);
ut_unit* inches = ut_parse(xmlSystem, "inches", UT_ASCII);
CU_ASSERT_EQUAL(ut_compare(inch, inches), 0);
ut_free(inch);
ut_free(inches);

ut_unit* foot = ut_parse(xmlSystem, "foot", UT_ASCII);
CU_ASSERT_PTR_NOT_NULL(foot);
ut_unit* feet = ut_parse(xmlSystem, "feet", UT_ASCII);
CU_ASSERT_EQUAL(ut_compare(foot, feet), 0);
ut_free(foot);
ut_free(feet);

In both cases, the singular and plural forms compared equal.

0.2 can't be represented as a fixed-length, floating-point value, so I'm not surprised that the converter doesn't do a perfect job.

--Steve

elipousson commented 1 year ago

Thanks, @semmerson – I'll re-open the issue on the units package repository and see if it is possible to incorporate ut_compare() into the implementation of ==. I don't think the package is using that currently although I could be wrong. Appreciate the quick response and the tip for a possible solution!

Enchufa2 commented 1 year ago

Thanks @semmerson for the explanation. No, we currently do not use ut_compare, I'll take a look.

In any case, wouldn't be faster and more accurate if the converter just detected that units are synonyms/aliases and then the value is not modified at all? That is, unless this information is not available internally at the ut_unit level.

semmerson commented 1 year ago

Return a trivial converter if ut_compare() returns 0? Good idea. I'll stick it on the list.

Enchufa2 commented 1 year ago

Great, thanks!

Another question... Shouldn't ut_compare automatically raise an error message if units are not convertible?

semmerson commented 1 year ago

Iñaki,

Another question... Shouldn't ut_compare automatically raise an error message if units are not convertible?

No. ut_are_convertible() is for that determination. ut_compare() is for putting all units in a global order so that, for example, they can be inserted into a binary tree.

--Steve