xoceanmodel / xroms

Work with ROMS ocean model output with xarray
https://xroms.readthedocs.io/
MIT License
61 stars 38 forks source link

Dimension failure in roms_dataset #69

Open kyle-hinson opened 5 months ago

kyle-hinson commented 5 months ago

Hello, I'm running into an issue when trying to run roms_dataset that I believe may be related to a recent upgrade of my xarray package (version 2024.5.0) and is hopefully useful for others. Below is an example of the relevant code that I'm running.

import xarray as xr
import xroms

ds = xr.open_dataset(file_name, chunks={'ocean_time': 1})
ds = xroms.roms_dataset(ds)

The error that I am encountering is below: ValueError: Dimensions {'X', 'T', 'Y', 'Z'} do not exist. Expected one or more of ('ocean_time', 'eta_rho', 'xi_rho', 's_w') This code worked fine right before I upgraded xarray (which I did because I was having issues reading in datasets with chunking), and seems to be encountering more issues related to xarray functions in the full error message below. My code seems to work without issue for xarray version 2024.3.0 but fails when using xarray version 2024.5.0. Not sure how to manage this in the future with additional package upgrades, but wanted to put it on people's radars. Thanks again, xroms is great to work with!

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 2
      1 ds = xr.open_dataset(file_name, chunks={'ocean_time': 1})
----> 2 ds, xgrid = xroms.roms_dataset(ds)

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xroms/xroms.py:243, in roms_dataset(ds, Vtransform, add_verts, proj, include_Z0, include_3D_metrics, include_cell_volume, include_cell_area)
    232     z_rho0.attrs = {
    233         "long_name": "depth of RHO-points",
    234         "field": "z_rho0, scalar",
    235         "units": "m",
    236     }
    237     z_w0.attrs = {
    238         "long_name": "depth of W-points",
    239         "field": "z_w0, scalar",
    240         "units": "m",
    241     }
--> 243 ds.coords["z_w"] = order(z_w)
    244 ds.coords["z_w_u"] = grid_interp(xgrid, ds["z_w"], "X")
    245 # ds.coords["z_w_u"] = xgrid.interp(ds.z_w, "X")

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xroms/utilities.py:1572, in order(var)
   1546 """Reorder var to typical dimensional ordering.
   1547 
   1548 Parameters
   (...)
   1566 >>> xroms.order(var)
   1567 """
   1569 # this looks at the dims on var directly which tends to be more accurate
   1570 # since the DataArray can have extra coordinates included (but dropping
   1571 # can drop too many variables).
-> 1572 return var.cf.transpose(
   1573     *[
   1574         dim
   1575         for dim in ["T", "Z", "Y", "X"]
   1576         if dim in var.cf.axes and var.cf.axes[dim][0] in var.dims
   1577     ]
   1578 )

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/cf_xarray/accessor.py:693, in _getattr.<locals>.wrapper(*args, **kwargs)
    689 posargs, arguments = accessor._process_signature(
    690     func, args, kwargs, key_mappers
    691 )
    692 final_func = extra_decorator(func) if extra_decorator else func
--> 693 result = final_func(*posargs, **arguments)
    694 if wrap_classes and isinstance(result, _WRAPPED_CLASSES):
    695     result = _CFWrappedClass(result, accessor)

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xarray/util/deprecation_helpers.py:140, in deprecate_dims.<locals>.wrapper(*args, **kwargs)
    132     emit_user_level_warning(
    133         f"The `{old_name}` argument has been renamed to `dim`, and will be removed "
    134         "in the future. This renaming is taking place throughout xarray over the "
   (...)
    137         PendingDeprecationWarning,
    138     )
    139     kwargs["dim"] = kwargs.pop(old_name)
--> 140 return func(*args, **kwargs)

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xarray/core/dataarray.py:3058, in DataArray.transpose(self, transpose_coords, missing_dims, *dim)
   3025 """Return a new DataArray object with transposed dimensions.
   3026 
   3027 Parameters
   (...)
   3055 Dataset.transpose
   3056 """
   3057 if dim:
-> 3058     dim = tuple(infix_dims(dim, self.dims, missing_dims))
   3059 variable = self.variable.transpose(*dim)
   3060 if transpose_coords:

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xarray/namedarray/utils.py:174, in infix_dims(dims_supplied, dims_all, missing_dims)
    172             yield d
    173 else:
--> 174     existing_dims = drop_missing_dims(dims_supplied, dims_all, missing_dims)
    175     if set(existing_dims) ^ set(dims_all):
    176         raise ValueError(
    177             f"{dims_supplied} must be a permuted list of {dims_all}, unless `...` is included"
    178         )

File /global/u2/user/mambaforge/envs/mamba_env1/lib/python3.10/site-packages/xarray/namedarray/utils.py:128, in drop_missing_dims(supplied_dims, dims, missing_dims)
    126     supplied_dims_set = {val for val in supplied_dims if val is not ...}
    127     if invalid := supplied_dims_set - set(dims):
--> 128         raise ValueError(
    129             f"Dimensions {invalid} do not exist. Expected one or more of {dims}"
    130         )
    132     return supplied_dims
    134 elif missing_dims == "warn":

ValueError: Dimensions {'X', 'T', 'Y', 'Z'} do not exist. Expected one or more of ('ocean_time', 'eta_rho', 'xi_rho', 's_w')
kthyng commented 5 months ago

Hi! I'll try to take a look at this next week.

kthyng commented 4 months ago

Hi @kyle-hinson! I investigated with some ROMS output I have available and didn't hit any errors. Here is the code I used, with some variation indicated in comments:

import xarray as xr
import xroms

# loc = "https://thredds.aoos.org/thredds/dodsC/AWS_CIOFS.nc"
loc = "http://xpublish-nwgoa.srv.axds.co/datasets/nwgoa_all/zarr/"
ds = xr.open_zarr(loc, chunks={"ocean_time": 1})
# ds = xr.open_zarr(loc, chunks={})
ds = xroms.roms_dataset(ds)
# with xarray v2024.6.0 didn't hit error

# try 2024.5.0, also ok
# see what happens for 2024.3.0, also ok
# only got warnings from xgcm
# /Users/kthyng/miniconda3/envs/fixxroms/lib/python3.12/site-packages/xgcm/grid_ufunc.py:832: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
#   out_dim: grid._ds.dims[out_dim] for arg in out_core_dims for out_dim in arg

I tried xarray 2024.6.0, 2024.5.0, and 2024.3.0, and two different chunking schemes and I had the same results each time. I started from a fresh environment so it's possible this is due to environment differences, but I think it's more likely due to differences in our model output. The roms_dataset function is meant to account for these differences by adding attributes that might be missing, but it must be missing something. It's possible that our two datasets are going down two different pathways in the function so that mine didn't hit exactly the same set of conditions, too, though all functions eventually hit that order() function to make sure arrays stay in the correct dimensional ordering (which often catches issues with attributes).

I think this will be difficult to debug given this information so I think for now you should go ahead with the older version of xarray that works for you, but please update this issue if you get more ideas or information about what is going on. Thanks for posting.

kthyng commented 4 months ago

@kyle-hinson And let me know if the workaround stops working and this becomes a blocker for you.