xarray-contrib / cf-xarray

an accessor for xarray objects that interprets CF attributes
https://cf-xarray.readthedocs.io/
Apache License 2.0

More lenient guess_coord_axis() rules? #464

Open kthyng opened 1 year ago

kthyng commented 1 year ago

Hi. I regularly find myself wishing guess_coord_axis() would find more dimension and coordinate names. I started adding some for a PR, but I see that some are purposely excluded. For example, I would like it to match "lat" anywhere in the variable name, not only at the start. I see this was done on purpose, presumably because some datasets use "nlat" as a dimension that counts latitudes, and those users want it treated as "Y", not "latitude". The same goes for lon, time, and Z.

I would also like to add "eta" and "xi" to the guess regex for "Y" and "X" respectively, to pick up dimensions in ROMS files. However, this might cause problems for users whose other variable names happen to contain those strings.

Would it make sense to have a flag for levels of leniency for guesses? Or for whether they have to be at the start of the variable name or not?
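
To make the trade-off concrete, here is a minimal sketch of what such a flag could look like. It is not cf-xarray's code: the token tables, the `guess_key` function, and the `lenient` parameter are all hypothetical, and the real regexes in cf-xarray are more elaborate. Strict mode only matches a token at the start of the name; lenient mode matches it anywhere and also enables the ROMS-style "eta"/"xi" tokens.

```python
import re

# Hypothetical token tables, for illustration only.
# These are NOT the regexes cf-xarray actually uses.
STRICT_TOKENS = {"latitude": ["lat"], "longitude": ["lon"]}
LENIENT_EXTRA_TOKENS = {"Y": ["eta"], "X": ["xi"]}


def guess_key(name, lenient=False):
    """Return the first key whose token matches `name`, or None.

    Hypothetical helper: `lenient` is an assumed flag, not an existing
    guess_coord_axis() option.
    """
    tables = [STRICT_TOKENS]
    if lenient:
        tables.append(LENIENT_EXTRA_TOKENS)
    # re.match anchors at the start of the name; re.search looks anywhere.
    matcher = re.search if lenient else re.match
    lowered = name.lower()
    for table in tables:
        for key, tokens in table.items():
            pattern = "|".join(re.escape(token) for token in tokens)
            if matcher(pattern, lowered):
                return key
    return None


for name in ["latitude", "nlat", "eta_rho", "xi_rho", "zeta", "platform_id"]:
    print(name, guess_key(name), guess_key(name, lenient=True))
```

With these assumed tables, lenient mode picks up "nlat", "eta_rho" and "xi_rho", but it also labels "zeta" as "Y" and "platform_id" as "latitude", which is the kind of false positive the anchored rules avoid.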

I can submit the code I have if this is too abstract to discuss. Thanks.

kthyng commented 1 year ago

@dcherian What do you think about this? My hope is to avoid needing to change e.g. eta_rho to a coordinate and add the attribute axis="Y" for all ROMS output.

dcherian commented 1 year ago

For ROMS I would add a SGRID variable.

cf_xarray.datasets.sgrid_roms["grid"]

has all the metadata you need and it's the same for all ROMS simulations AFAICT

dcherian commented 1 year ago

This could be a good addition to the FAQ: https://cf-xarray.readthedocs.io/en/latest/faq.html

Add a grid_topology variable to allow referring to variables without actual values.