Closed 28raining closed 1 year ago
Thanks for the report.
According to the netcdf standard:
"Names that have trailing space characters are also not permitted."
See: https://docs.unidata.ucar.edu/nug/current/netcdf_data_set_components.html
Maybe you could add a quick cleanup script before writing?
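For illustration, such a cleanup step might look roughly like the sketch below. The helper name `strip_name_mapping` is hypothetical (it is not part of xarray or netCDF4), and it only removes trailing whitespace; it raises instead of guessing when two names would become identical after cleaning.

```python
def strip_name_mapping(names):
    """Return {old_name: cleaned_name} for names that have trailing whitespace.

    Hypothetical helper, not an xarray API. Non-string names are left alone.
    """
    mapping = {}
    seen = set()
    for name in names:
        cleaned = name.rstrip() if isinstance(name, str) else name
        if cleaned in seen:
            # Refuse to merge two variables silently.
            raise ValueError(
                f"cleaning {name!r} collides with another variable named {cleaned!r}"
            )
        seen.add(cleaned)
        if cleaned != name:
            mapping[name] = cleaned
    return mapping


# Assumed usage, where `ds` is an xarray.Dataset:
#     ds = ds.rename(strip_name_mapping(ds.variables))
#     ds.to_netcdf("out.nc")
```

Because the renaming is explicit, the caller can log or store the mapping, so the change to the names is not silent.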
I have done that. I have over 100 variables, so figuring out that it was a trailing space was time-consuming.
And, as in the other ticket, how many of these random gotchas are there?
At least the error message needs to be better. Ideally the tool should just handle it; either to_netcdf cleans up, or to_xarray cleans up
This error is raised by netCDF4; xarray doesn't care about the variable names, which don't even need to be strings.
You could open an issue with netCDF4 to improve the error message. I agree that it should at least contain the invalid name and possibly even the violated rule.
Ideally the tool should just handle it; either to_netcdf cleans up, or to_xarray cleans up
to_xarray shouldn't clean it up because spaces (and forward slashes) are (currently) valid characters in xarray variable names.
to_netcdf probably shouldn't automatically clean it up either because then you will silently write a file with different metadata than specified, which would break round-tripping.
What we could do is raise a better error message (as @headtr1ck says), which ideally would be implemented upstream in netCDF4 but could also be implemented as a check within the corresponding xarray backend, similar to https://github.com/pydata/xarray/pull/7953.
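As a rough sketch of what such a check could report (this is not the actual xarray or netCDF4 code, and both function names are hypothetical), the following covers only the two cases mentioned in this thread, trailing spaces and forward slashes, and lists every offending name together with the rule it violates:

```python
def find_invalid_names(names):
    """Return a list of (name, reason) pairs for names netCDF would reject.

    Only checks the two rules discussed here; the full netCDF naming rules
    contain more restrictions. Non-string names are skipped in this sketch.
    """
    problems = []
    for name in names:
        if not isinstance(name, str):
            continue
        if name.endswith(" "):
            problems.append((name, "trailing space"))
        if "/" in name:
            problems.append((name, "forward slash"))
    return problems


def check_names(names):
    """Raise one ValueError that names every invalid variable and its rule."""
    problems = find_invalid_names(names)
    if problems:
        details = "; ".join(f"{name!r}: {reason}" for name, reason in problems)
        raise ValueError(f"Invalid netCDF variable name(s): {details}")
```

Using `repr` in the message makes a trailing space visible (for example `'temp '`), which is the part that is hard to spot among 100+ variables.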
@28raining if you want to open a pull request to add an error message like https://github.com/pydata/xarray/pull/7953 does then I think that would be welcome (others please point out if you see an issue with that).
I totally agree.
In general I would advise against implementing the same checks on the xarray side, since this just leads to divergent behavior in the future. But in this case, I would say that the netCDF rules are set in stone, and additionally we can advise users to try a different backend that allows invalid netCDF files.
I've opened a PR upstream to at least make the error messages more helpful: https://github.com/Unidata/netcdf4-python/pull/1268.
Suggestions welcome on further improvements!
What happened?
If a variable name ends in a ' ' (space), then to_netcdf crashes.
In my opinion
This is related to https://github.com/pydata/xarray/issues/7943
Our tool converts CSV files to xarray, so these kinds of errors need to self-heal; otherwise we have to give feedback to the user and get them to change the CSV file, which would be frustrating for everyone.
What did you expect to happen?
No response
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment