xarray-contrib / cf-xarray

an accessor for xarray objects that interprets CF attributes
https://cf-xarray.readthedocs.io/
Apache License 2.0

Discovering Axes for ROMS #84

Closed kthyng closed 3 years ago

kthyng commented 4 years ago

I am trying to use this excellent package but it doesn't seem to work for ROMS output. What is the best way to use this when the input names are apparently not expected? Here is what I get:

for full Dataset: image

For single DataArray: image

For the T axis, ocean_time should be straight-forward since there is only one. For vertical coordinates, there are actually a bunch in my Dataset for all different grid configurations: image

dcherian commented 4 years ago

But it's discovered a lot! You can use 'latitude', 'longitude', 'vertical', and 'Z' at this point.

ds = ds.cf.guess_coord_axis(verbose=True) will make some guesses (and tell you why) based on regexes that match variable name and datetime64 / cftime objects. This won't add the axis:X, axis:Y, axis:Z attributes for this dataset. But you could do this in xroms for example.

I'm happy to merge more heuristics for guess_coord_axis if you have ideas.

dcherian commented 4 years ago

Sorry I missed the bit about z_*

I think we could try to mark z_ variables as Z axis by slightly changing this regex if it doesn't work already. Then guess_coord_axis will add axis:Z to all those variables.

https://github.com/xarray-contrib/cf-xarray/blob/235acac4771597fbd0fe262962450cc1b17ac48c/cf_xarray/accessor.py#L109-L112
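To make the regex idea concrete, here is a minimal sketch of name-based matching for ROMS-style vertical coordinates. The pattern `Z_PATTERN`, the helper `guess_z_names`, and the variable names are hypothetical and chosen for illustration; this is not the regex at the link above.

```python
import re

# Hypothetical pattern, for illustration only (not the one in cf_xarray/accessor.py).
# It matches "z" followed by any number of "_suffix" groups, e.g. z_rho, z_w, z_rho_u.
Z_PATTERN = re.compile(r"z(_[a-z0-9]+)*")


def guess_z_names(names):
    """Return the names that look like vertical coordinates."""
    # fullmatch requires the whole name to fit the pattern, so a name such as
    # "zeta" (which merely starts with "z") is not picked up.
    return [name for name in names if Z_PATTERN.fullmatch(name)]


# Example variable names in the style of a ROMS output file.
names = ["z_rho", "z_w", "z_rho_u", "zeta", "lon_rho", "ocean_time"]
z_names = guess_z_names(names)
print(z_names)  # ['z_rho', 'z_w', 'z_rho_u']
```

Anchoring on the full name is a design choice in this sketch: a looser prefix match would also tag variables that happen to start with "z" but are not vertical coordinates.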

Related: There's some discussion about X vs longitude and Y vs latitude here: https://github.com/xarray-contrib/cf-xarray/issues/23

EDIT: (also i love that you're using this with ROMS; i've been testing with MOM6 mostly so this is a great way to generalize things a bit more.).

kthyng commented 4 years ago

I didn't mean to imply nothing is working, sorry! It looked to me like the Axes labels are important for what I am trying to do, but I am also still learning what the objectives of cf-xarray are so that I can use it properly. I hate having to specify the grid that I am on when I am doing unambiguous operations on DataArrays, and it looks like cf-xarray can help me with this, but it also looks like I need the Axes properly labeled for that. I see that having the coordinates pulled out like they already are will allow me to plot without having to specify a different staggered grid for each variable, so that is great.

I'm fine with X, Y, Z for dimensions vs. lon, lat for coordinates, though ROMS people use xi and eta instead of X and Y so having the choice between X, Y and xi, eta might be good.

How is time identified? This doesn't seem to work for either Axes or coords so far (name is ocean_time).

I tried ds.cf.guess_coord_axis(verbose=True) but it borked with RecursionError: maximum recursion depth exceeded while calling a Python object.

kthyng commented 4 years ago

Oh, and I think it is fine to have xroms set up things that don't make sense to go into cf-xarray, but 1. I am not clear where that line is and 2. could you point me to any examples for how to set it up by hand for that purpose?

kthyng commented 4 years ago

Ah, sorry for posting again, but I see you probably mean to modify attributes of each variable as described here: https://cf-xarray.readthedocs.io/en/latest/examples/introduction.html#What-attributes-have-been-discovered?

dcherian commented 4 years ago

> I didn't mean to imply nothing is working, sorry!

I didn't take it to mean that. no worries!

> I'm fine with X, Y, Z for dimensions vs. lon, lat for coordinates, though ROMS people use xi and eta instead of X and Y so having the choice between X, Y and xi, eta might be good.

OK, cf_xarray works by interpreting CF attributes, so these attributes need to be set. (guess_coord_axis will set these attributes by looking for some common patterns.) Here it looks like the lon_* variables are tagged with attrs["standard_name"] = "longitude" or attrs["units"] = "degrees_east", for example.

The "CF" compliance bit rules out using xi, eta but I think what would be most useful here is to set appropriate attrs for xi_rho etc in xroms

ds.coords["xi_rho"] = ds.xi_rho  # makes coordinate variable so we can set attributes; weird syntax
ds.xi_rho.attrs["axis"] = "X"
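A slightly fuller sketch of the same idea, applied to both horizontal dimensions of a toy dataset. The dataset contents and the dimension-to-axis mapping are assumptions for illustration; only plain xarray is used here, with no cf-xarray calls.

```python
import numpy as np
import xarray as xr

# Toy dataset: "temp" lives on dimensions that have no coordinate values yet,
# as is typical for xi_rho / eta_rho in ROMS output.
ds = xr.Dataset({"temp": (("eta_rho", "xi_rho"), np.zeros((3, 4)))})

# Assumed mapping from ROMS dimension names to CF axis labels.
axis_for_dim = {"xi_rho": "X", "eta_rho": "Y"}

for dim, axis in axis_for_dim.items():
    # Promote the bare dimension to a coordinate variable (integer positions),
    # so that there is a variable to attach attributes to.
    ds.coords[dim] = ds[dim]
    ds[dim].attrs["axis"] = axis

print(ds["xi_rho"].attrs, ds["eta_rho"].attrs)
```

The same loop could be extended to the other staggered grids (for example xi_u, eta_v) if those dimensions are present.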

> I tried ds.cf.guess_coord_axis(verbose=True) but it borked with RecursionError: maximum recursion depth exceeded while calling a Python object.

This is a bug. It should detect that ocean_time is of type datetime64[ns] and set ocean_time.attrs["axis"] = "T". If you can figure out where it's going wrong, that would be very helpful.
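Until that is sorted out, the expected behaviour can be approximated by hand. This is a minimal sketch, assuming a toy dataset; `tag_time_axis` is a hypothetical helper, not a cf-xarray function, and it only handles datetime64 coordinates (cftime objects are not covered).

```python
import numpy as np
import xarray as xr


def tag_time_axis(ds):
    """Set axis="T" on every datetime64 coordinate (hypothetical helper)."""
    ds = ds.copy()  # work on a shallow copy rather than the input object
    for name, var in ds.coords.items():
        # np.issubdtype is True for any datetime64 resolution, including [ns].
        if np.issubdtype(var.dtype, np.datetime64):
            ds[name].attrs["axis"] = "T"
    return ds


# Toy dataset with a ROMS-style time coordinate.
times = np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[ns]")
ds = xr.Dataset(
    {"temp": (("ocean_time",), np.zeros(2))},
    coords={"ocean_time": times},
)

tagged = tag_time_axis(ds)
print(tagged["ocean_time"].attrs)
```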

dcherian commented 4 years ago

image

Looks like this works for ocean_time.

The tagging of xi_rho won't work this way, so we avoid doing that for "unindexed variables".

dcherian commented 3 years ago

Closing. Feel free to reopen if there are other things not working.

kthyng commented 3 years ago

Quick question:

Is guess_coord_axis meant to be used to add in necessary attributes to be recognized by cf-xarray? For example, I could put at the end of a function where attributes might have been lost:

return da.cf.guess_coord_axis()

and maybe this would save some of the lines I've been adding to get attributes added?

dcherian commented 3 years ago

> Is guess_coord_axis meant to be used to add in necessary attributes to be recognized by cf-xarray?

:+1: Yes

> maybe this would save some of the lines I've been adding to get attributes added?

If you can trust the guessing heuristics. A more long term fix would be to report these attribute losses upstream somewhere (I've been slowly working on fixing them).

dcherian commented 1 year ago

We now have SGRID support, so xi_* and eta_* don't need to have values associated with them anymore as long as the grid_topology variable is present.