dgbowl / yadg

yadg: yet another datagram
https://dgbowl.github.io/yadg
GNU General Public License v3.0

stepdefaults.py failing to set default locale and raising an error #177

Closed axnj2 closed 3 weeks ago

axnj2 commented 3 weeks ago

Error

Steps to reproduce:

$ pip install yadg
$ yadg

Output (actual path changed to "localPath" for clarity):

Traceback (most recent call last):
  File "localPath/venv/bin/yadg", line 5, in <module>
    from yadg import run_with_arguments
  File "localPath/venv/lib/python3.12/site-packages/yadg/__init__.py", line 3, in <module>
    from .main import run_with_arguments
  File "localPath/venv/lib/python3.12/site-packages/yadg/main.py", line 6, in <module>
    from yadg import subcommands
  File "localPath/venv/lib/python3.12/site-packages/yadg/subcommands.py", line 11, in <module>
    from dgbowl_schemas.yadg import to_dataschema
  File "localPath/venv/lib/python3.12/site-packages/dgbowl_schemas/__init__.py", line 1, in <module>
    from . import dgpost, yadg, tomato
  File "localPath/venv/lib/python3.12/site-packages/dgbowl_schemas/yadg/__init__.py", line 6, in <module>
    from .dataschema_5_0 import DataSchema as DataSchema_5_0, Metadata as Metadata_5_0
  File "localPath/venv/lib/python3.12/site-packages/dgbowl_schemas/yadg/dataschema_5_0/__init__.py", line 16, in <module>
    class DataSchema(BaseModel, extra="forbid"):
  File "localPath/venv/lib/python3.12/site-packages/dgbowl_schemas/yadg/dataschema_5_0/__init__.py", line 25, in DataSchema
    step_defaults: StepDefaults = Field(StepDefaults())
                                        ^^^^^^^^^^^^^^
  File "localPath/venv/lib/python3.12/site-packages/pydantic/main.py", line 193, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
  File "localPath/venv/lib/python3.12/site-packages/dgbowl_schemas/yadg/dataschema_5_0/stepdefaults.py", line 38, in locale_set_default
    v = ".".join(locale.getlocale())
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: sequence item 0: expected str instance, NoneType found
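
As an illustration of the last frame (not part of the original report): `str.join` raises exactly this `TypeError` when the tuple it receives contains `None`. The tuple below is an assumed value chosen to match the message in the traceback; it was not captured from the affected machine.

```python
# Assumed value: a (language, encoding) pair in which the language code
# could not be determined, so the first item is None.
parts = (None, "UTF-8")

try:
    ".".join(parts)
except TypeError as exc:
    message = str(exc)

print(message)  # sequence item 0: expected str instance, NoneType found
```

Note that `locale.getlocale()` called without arguments queries the `LC_CTYPE` category, which is set to the bare value "UTF-8" in the `locale` output further down.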

The same error is thrown when importing yadg:

import yadg

Environment details

OS: macOS 14.5 (23F79)
Python: v3.12.5 or v3.9.19 (in a venv with only yadg installed)
Shell: zsh 5.9 (x86_64-apple-darwin23.0)

Output of $ locale:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

yadg version

yadg 5.1 (also tried the development version from GitHub)

Workaround / solution

Using the latest version of stepdefaults.py from the dgbowl-schemas repository on GitHub (https://github.com/dgbowl/dgbowl-schemas) bypasses the issue by setting a default locale if none is found (https://github.com/dgbowl/dgbowl-schemas/blob/master/src/dgbowl_schemas/yadg/dataschema_5_1/stepdefaults.py).

Alternatively, hard-coding the return value of locale_set_default() to your locale also avoids the error.
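
A minimal sketch of the "default locale if none is found" idea, for illustration only. The function name `locale_or_default` and the fallback value are hypothetical and are not taken from dgbowl-schemas; the actual fix in that repository may work differently.

```python
import locale

# Assumption: fallback chosen purely for illustration.
FALLBACK_LOCALE = "en_US.UTF-8"


def locale_or_default(parts=None, fallback=FALLBACK_LOCALE):
    """Join a (language, encoding) pair, or return a fallback if it is incomplete."""
    if parts is None:
        try:
            # With no argument, getlocale() queries the LC_CTYPE category.
            parts = locale.getlocale()
        except ValueError:
            # Raised when the locale name cannot be parsed.
            return fallback
    # Either item may be None when the locale cannot be determined.
    if any(p is None for p in parts):
        return fallback
    return ".".join(parts)


print(locale_or_default((None, "UTF-8")))     # en_US.UTF-8
print(locale_or_default(("de_DE", "UTF-8")))  # de_DE.UTF-8
```

The check happens before the join, so an incomplete tuple never reaches `str.join` and the `TypeError` from the traceback cannot occur.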

PS: This was my first time writing an issue. Did I do well? Any feedback would be appreciated ^^.

PeterKraus commented 3 weeks ago

Thanks for the (great) issue. You're the first user of yadg I know of who uses a Mac. Could you let me know which version of dgbowl-schemas is installed, or paste the output of pip freeze?

PeterKraus commented 3 weeks ago

Regardless, I think you're absolutely right - in DataSchema-5.1 we pick up the LC_NUMERIC (which is set), but in DataSchema-5.0 we attempt to parse LC_ALL, which is unset. I will push a fix out.

In the meantime, please try running yadg again after setting LC_ALL to something like "en_US.UTF-8" (using export or whatever the command is in zsh).

axnj2 commented 3 weeks ago

Thank you for the quick response.

Here is the output of pip freeze:

annotated-types==0.7.0
appdirs==1.4.4
babel==2.16.0
dgbowl-schemas==117
et-xmlfile==1.1.0
flexcache==0.3
flexparser==0.3.1
h5netcdf==1.3.0
h5py==3.11.0
numpy==2.1.0
olefile==0.47
openpyxl==3.1.5
packaging==24.1
pandas==2.2.2
Pint==0.24.3
pydantic==2.8.2
pydantic_core==2.20.1
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.2
six==1.16.0
striprtf==0.0.26
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
uncertainties==3.2.2
xarray==2024.7.0
xarray-datatree==0.0.14
yadg==5.1

So the installed version of dgbowl-schemas is 117.

axnj2 commented 3 weeks ago

> Regardless, I think you're absolutely right - in DataSchema-5.1 we pick up the LC_NUMERIC (which is set), but in DataSchema-5.0 we attempt to parse LC_ALL, which is unset. I will push a fix out.
>
> In the meantime, please try running yadg again after setting LC_ALL to something like "en_US.UTF-8" (using export or whatever the command is in zsh).

This worked. After setting LC_ALL to "en_US.UTF-8", I got:

$ yadg
usage: yadg [--version] [--verbose] [--quiet] {process,update,preset,extract} ...
yadg: error: the following arguments are required: subcommand

This is what I would expect.

PeterKraus commented 3 weeks ago

This should now be fixed; try installing the updated dgbowl-schemas==118.

axnj2 commented 3 weeks ago

Thank you!