SciTools / cf-units

Units of measure as required by the Climate and Forecast (CF) Metadata Conventions
https://cf-units.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Locale issue #435

Open bouweandela opened 2 months ago

bouweandela commented 2 months ago

🐛 Bug Report

cf_units fails to import when the LC_NUMERIC environment variable is set to a locale that uses "," instead of "." as the decimal_point character and Python applies that locale via locale.setlocale. This typically happens in documentation builds with Sphinx. See https://github.com/pydata/xarray/issues/4257 for an example. I have encountered the same issue when building the iris and ESMValCore documentation.
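Whether a given environment meets this condition can be checked before importing cf_units. The following is a minimal sketch and not part of cf_units; the helper name `numeric_decimal_point` is made up for illustration. It adopts the locale from the environment and reports the resulting decimal point character.

```python
import locale


def numeric_decimal_point() -> str:
    """Return the decimal point character of the active LC_NUMERIC locale."""
    return locale.localeconv()["decimal_point"]


if __name__ == "__main__":
    try:
        # Adopt the locale settings from the environment (LC_NUMERIC, LANG, ...).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; the current locale is kept.
        pass
    point = numeric_decimal_point()
    if point != ".":
        print(f"decimal point is {point!r}: importing cf_units may fail")
    else:
        print("decimal point is '.': not affected by this issue")
```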

How to Reproduce

Steps to reproduce the behaviour:

  1. Run export LC_NUMERIC=nl_NL.UTF-8
  2. View locale settings by running locale:
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC=nl_NL.UTF-8
    LC_TIME=nl_NL.UTF-8
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY=nl_NL.UTF-8
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER=nl_NL.UTF-8
    LC_NAME=nl_NL.UTF-8
    LC_ADDRESS=nl_NL.UTF-8
    LC_TELEPHONE=nl_NL.UTF-8
    LC_MEASUREMENT=nl_NL.UTF-8
    LC_IDENTIFICATION=nl_NL.UTF-8
    LC_ALL=
  3. Run the following Python code, which applies the locale from the environment and then imports cf_units:

    import locale
    locale.setlocale(locale.LC_ALL, '')
    import cf_units
  4. Resulting stack trace:
    
    Traceback (most recent call last):
    File "/home/bandela/src/scitools/cf_units/cf_units/__init__.py", line 187, in <module>
    _ud_system = _ud.read_xml()
                 ^^^^^^^^^^^^^^
    File "cf_units/_udunits2.pyx", line 194, in cf_units._udunits2.read_xml
    return wrap_system(csystem)
    File "cf_units/_udunits2.pyx", line 104, in cf_units._udunits2.wrap_system
    _raise_error()
    File "cf_units/_udunits2.pyx", line 184, in cf_units._udunits2._raise_error
    raise UdunitsError(status, errnum)
    cf_units._udunits2.UdunitsError: UT_PARSE

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File "/home/bandela/src/scitools/cf_units/cf_units/__init__.py", line 190, in <module>
    _ud_system = _ud.read_xml(config.get_xml_path())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "cf_units/_udunits2.pyx", line 194, in cf_units._udunits2.read_xml
    return wrap_system(csystem)
    File "cf_units/_udunits2.pyx", line 104, in cf_units._udunits2.wrap_system
    _raise_error()
    File "cf_units/_udunits2.pyx", line 184, in cf_units._udunits2._raise_error
    raise UdunitsError(status, errnum)
    cf_units._udunits2.UdunitsError: UT_PARSE

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File "/home/bandela/src/scitools/cf_units/test.py", line 4, in <module>
    import cf_units
    File "/home/bandela/src/scitools/cf_units/cf_units/__init__.py", line 193, in <module>
    raise OSError(
    OSError: [UT_PARSE] Failed to open UDUNITS-2 XML unit database


## Expected Behaviour

A successful import.

## Environment 
 - OS & Version: Ubuntu 24.04 LTS
 - cf-units Version: 3.2.1.dev57

## Additional Context
When I comment out the `suppress_warnings` context manager here https://github.com/SciTools/cf-units/blob/f57a7f1fdba4e9424e4dbbc01b614e521a88086c/cf_units/__init__.py#L182
I see the following error message:

    Invalid numeric prefix value ".1"
    parsing aborted
    File "/home/bandela/mambaforge/envs/cf-units-dev/share/udunits/udunits2-prefixes.xml", line 43, column 25
    parsing aborted
    File "/home/bandela/mambaforge/envs/cf-units-dev/share/udunits/udunits2.xml", line 11, column 42
    Traceback (most recent call last):
    File "/home/bandela/src/scitools/cf_units/test.py", line 4, in <module>
    import cf_units
    File "/home/bandela/src/scitools/cf_units/cf_units/__init__.py", line 187, in <module>
    _ud_system = _ud.read_xml()
                 ^^^^^^^^^^^^^^
    File "cf_units/_udunits2.pyx", line 194, in cf_units._udunits2.read_xml
    return wrap_system(csystem)
    File "cf_units/_udunits2.pyx", line 104, in cf_units._udunits2.wrap_system
    _raise_error()
    File "cf_units/_udunits2.pyx", line 184, in cf_units._udunits2._raise_error
    raise UdunitsError(status, errnum)
    cf_units._udunits2.UdunitsError: UT_PARSE


Some [searching through the udunits2 code](https://github.com/search?q=repo%3AUnidata%2FUDUNITS-2+%22Invalid+numeric+prefix+value%22&type=code) suggests that the error originates from the [`strtod`](https://en.cppreference.com/w/c/string/byte/strtof) call in [this code](https://github.com/Unidata/UDUNITS-2/blob/c83da987387db1174cd2266b73dd5dd556f4476b/lib/xml.c#L1731-L1742), whose result depends on the `decimal_point` setting of the current locale.
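The locale dependence of `strtod` can be observed from Python without udunits2. This is a rough sketch, assuming a POSIX system where the C library can be loaded through `ctypes`; the helper names `c_strtod` and `demo` are made up for illustration and the `nl_NL.UTF-8` locale has to be installed for the second call to run. Under the C locale `strtod` should consume all of `.1`; under a locale with a comma as decimal point I would expect it to stop before the `.` and leave it unparsed, which would match the "Invalid numeric prefix value" message.

```python
import ctypes
import ctypes.util
import locale

# Load the C library; on POSIX, CDLL(None) falls back to the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strtod.restype = ctypes.c_double
libc.strtod.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p)]


def c_strtod(text: bytes):
    """Call the C strtod and return (value, unparsed remainder)."""
    end = ctypes.c_char_p()
    value = libc.strtod(text, ctypes.byref(end))
    return value, end.value


def demo(locale_name: str = "nl_NL.UTF-8") -> None:
    """Print the strtod result for b'.1' under the C locale and locale_name."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, "C")
        print("C:", c_strtod(b".1"))
        try:
            locale.setlocale(locale.LC_NUMERIC, locale_name)
        except locale.Error:
            print(f"{locale_name} is not installed; skipping")
            return
        print(f"{locale_name}:", c_strtod(b".1"))
    finally:
        # Restore whatever LC_NUMERIC was active before the demo.
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    demo()
```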

bouweandela commented 2 months ago

I had a go at fixing this in #436, but there appear to be some issues with CI.
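Until a fix is available, a user-side workaround might be to force LC_NUMERIC to the C locale while cf_units is imported, since the traceback shows the unit database being read at import time. The sketch below is not the change proposed in #436; `c_numeric_locale` is a hypothetical helper name, and I have not verified that it covers every code path that parses numbers.

```python
import contextlib
import locale


@contextlib.contextmanager
def c_numeric_locale():
    """Temporarily switch LC_NUMERIC to the C locale, then restore it."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


# Usage (assumes cf_units is installed):
# with c_numeric_locale():
#     import cf_units
```

Note that `locale.setlocale` changes the locale for the whole process and is not thread-safe, so this is best done once at start-up, before other threads exist.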