tobac-project / tobac

Tracking and object-based analysis of clouds
BSD 3-Clause "New" or "Revised" License

Xarray/Iris incompatibilities around integer arrays #418

Closed by freemansw1 8 months ago

freemansw1 commented 8 months ago

When converting between xarray.DataArray and iris.cube.Cube for integer arrays, a bug in xarray (present in versions prior to 2023.06, according to @w-k-jones) causes an exception to be raised. @w-k-jones has provided a detailed description of the issue here: https://github.com/tobac-project/tobac/pull/378#issuecomment-1974796854 .

I'm moving this to a new issue so that we can figure out what to do. We will need a decision for some examples that @w-k-jones has added in #378 and for the xarray transition for segmentation that I have worked on in #417.

As far as I see it, we have the following options:

  1. Pin our xarray version requirements to the non-offending versions (it would be good to go back and see if we can allow any older versions)
  2. Find a way to work around the issue for conversions
  3. Skip ahead to v2.0 and ditch iris entirely without a deprecation period (okay, okay, I can dream)
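
For option 1, a minimal sketch of a version check might look like the following. It assumes the affected releases are those before 2023.06, as stated above; `needs_integer_workaround` is a hypothetical helper name, not part of tobac or xarray.

```python
import re


def needs_integer_workaround(xarray_version: str) -> bool:
    """Return True if the given xarray version string predates 2023.06.

    Hypothetical helper: the 2023.06 cutoff is taken from this discussion
    and has not been checked against the xarray changelog.
    """
    # Read the first two numeric components, e.g. "2023.06.0" or "v2023.5.0"
    match = re.match(r"v?(\d+)\.(\d+)", xarray_version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {xarray_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison also classifies pre-calendar versions such as "0.20.1" as older
    return (major, minor) < (2023, 6)
```

The installed version string could be obtained with `importlib.metadata.version("xarray")` and passed to this function to decide whether a pin or a workaround is needed.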

w-k-jones commented 8 months ago

I think I've handled this in the past by replacing the masked integer array with a filled array in the iris cube before converting to xarray, but I can't find the notebook at the moment. I will try to recreate it as a proof of concept.

w-k-jones commented 8 months ago

```python
import numpy as np
import numpy.ma as ma
import xarray as xr


def convert_cube_to_dataarray(cube):
    """
    Convert an iris cube to an xarray DataArray, averting the error for integer dtypes in xarray<v2023.06

    Parameters
    ----------
    cube : iris.cube.Cube
        Iris data cube

    Returns
    -------
    dataarray : xr.DataArray
        DataArray converted from cube. If the cube's core data is a masked array and has integer dtype,
        the returned DataArray will hold a numpy array with masked values filled with the minimum value for
        that integer dtype. Otherwise the data will be identical to that produced using xr.DataArray.from_iris
    """
    data = cube.core_data()
    # Only masked integer arrays trigger the conversion error in older xarray versions
    if isinstance(data, ma.MaskedArray) and np.issubdtype(data.dtype, np.integer):
        # Replace masked values with the smallest value the integer dtype can hold
        fill_value = np.iinfo(data.dtype).min
        return xr.DataArray.from_iris(cube.copy(data.filled(fill_value)))
    return xr.DataArray.from_iris(cube)
```

On older xarray versions, this should produce results identical to those from xarray>=v2023.06, and without the warning message.
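
To illustrate what the fill step does on its own, here is a small sketch using only numpy. The array `labels` is a made-up stand-in for `cube.core_data()`, not data from tobac.

```python
import numpy as np
import numpy.ma as ma

# A small masked integer array standing in for cube.core_data() (illustrative data)
labels = ma.masked_array(
    np.array([1, 2, 3, 4], dtype=np.int32),
    mask=[False, True, False, True],
)

# Masked entries are replaced with the minimum value representable by int32
fill_value = np.iinfo(labels.dtype).min
filled = labels.filled(fill_value)
```

Here `filled` is a plain `numpy.ndarray` with no mask, so any downstream code that relied on the mask would presumably need to treat the minimum integer value as the marker for missing data.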

w-k-jones commented 8 months ago

I have now added a fix for this as part of #378.

w-k-jones commented 8 months ago

Fixed with #378