geoschem / gcpy

Python toolkit for GEOS-Chem. Contains basic plotting scripts, plus the suite of GEOS-Chem benchmarking utilities.
https://gcpy.readthedocs.io

Update plot.py for more recent xarray; also allow dask arrays to be passed to single_panel, six_plot routines #257

Closed yantosca closed 1 year ago

yantosca commented 1 year ago

This PR fixes an issue that seems to have been introduced with recent versions of xarray. The following updates were made:

(1) The following code in routine get_extents_for_color (in gcpy/plot.py):

            return ds_new.where(\
                ds_new[lon_var] >= minlon, drop=True).\
                where(ds_new[lon_var] <= maxlon, drop=True).\
                where(ds_new[lat_var]>= minlat, drop=True).\
                where(ds_new[lat_var] <= maxlat, drop=True)

needed to be changed to

            # Add .compute() to force evaluation of ds_new[lon_var]
            # See https://github.com/geoschem/gcpy/issues/254
            # Also note: This may return as a dask.array.Array object
            return ds_new.where(\
                ds_new[lon_var].compute() >= minlon, drop=True).\
                where(ds_new[lon_var].compute() <= maxlon, drop=True).\
                where(ds_new[lat_var].compute() >= minlat, drop=True).\
                where(ds_new[lat_var].compute() <= maxlat, drop=True)

because where with drop=True needs the condition to be evaluated, and xarray now appears to leave ds_new[lon_var] and ds_new[lat_var] as lazy (unevaluated) arrays. Calling .compute() forces xarray to do the actual computation. This behavior seems to have changed in xarray recently. For a similar issue, see: https://github.com/hainegroup/oceanspy/issues/332. The object returned also seems to be of type dask.array.Array instead of xarray.DataArray or numpy.ndarray.
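The idea can be sketched in isolation as follows. This is a minimal illustration, not the GCPy code: crop_to_extent is a hypothetical helper, the dataset is synthetic, and 1-D lon/lat coordinates are assumed. It also combines the four conditions into a single mask instead of chaining four where calls. For NumPy-backed data .compute() simply returns the values unchanged, so the call should be safe either way.

```python
import numpy as np
import xarray as xr


def crop_to_extent(ds, minlon, maxlon, minlat, maxlat,
                   lon_var="lon", lat_var="lat"):
    """Hypothetical helper (not part of GCPy): crop ds to a lon/lat box."""
    # Evaluate the coordinates up front so that the boolean mask passed
    # to where(..., drop=True) is concrete rather than lazy.
    lon = ds[lon_var].compute()
    lat = ds[lat_var].compute()

    # One combined mask; the 1-D lon and lat conditions broadcast to 2-D.
    mask = (
        (lon >= minlon) & (lon <= maxlon)
        & (lat >= minlat) & (lat <= maxlat)
    )
    return ds.where(mask, drop=True)


# Synthetic 30-degree grid, for illustration only.
lon = np.arange(-180.0, 180.0, 30.0)
lat = np.arange(-90.0, 91.0, 30.0)
ds = xr.Dataset(
    {"conc": (("lat", "lon"), np.ones((lat.size, lon.size)))},
    coords={"lat": lat, "lon": lon},
)

cropped = crop_to_extent(ds, minlon=-60, maxlon=60, minlat=-30, maxlat=30)
```

Here cropped keeps only the grid boxes inside the requested extent; everything outside is dropped rather than filled with NaN.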

(2) We now must add this import statement:

    from dask.array import Array as DaskArray

so that DaskArray can be passed in calls to verify_variable_type.

(3) We must now also add DaskArray to the calls to verify_variable_type in six_plot and single_panel in plot.py:

    verify_variable_type(plot_val, (np.ndarray, xr.DataArray, DaskArray))
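To show what widening the accepted types achieves, here is a small sketch. The check_variable_type function below is a hypothetical stand-in written for this example, and is not claimed to match GCPy's verify_variable_type; the name ACCEPTED_TYPES is likewise an assumption. It requires dask to be installed.

```python
import numpy as np
import xarray as xr
import dask.array as da
from dask.array import Array as DaskArray

# Types accepted for plot data, mirroring the tuple used in the call above.
ACCEPTED_TYPES = (np.ndarray, xr.DataArray, DaskArray)


def check_variable_type(var, var_type):
    """Hypothetical stand-in: raise TypeError if var is not of var_type."""
    if not isinstance(var, var_type):
        raise TypeError(
            f"Expected one of {var_type}, but got {type(var)}"
        )


# A lazy dask array now passes the check alongside NumPy and xarray data.
lazy_vals = da.ones((2, 3), chunks=(1, 3))
check_variable_type(lazy_vals, ACCEPTED_TYPES)
check_variable_type(np.zeros(3), ACCEPTED_TYPES)
check_variable_type(xr.DataArray(np.zeros(3)), ACCEPTED_TYPES)
```

Without DaskArray in the tuple, the first call would raise TypeError, which is the kind of failure the benchmark plotting routines would hit when handed lazily evaluated data.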

(4) Update Pydoc headers accordingly:

        """
        ... etc ...

        plot_vals: xarray.DataArray, numpy.ndarray, or dask.array.Array
            Single data variable GEOS-Chem output to plot

        ... etc ...
        """

(5) Because these fixes allow benchmark plots to proceed, we can remove the pegged xarray version from environment.yml:

    #
    # NOTE: The most recent xarray (2023.8.0) seems to break backwards
    # compatibility with the benchmark plotting code.  Peg to 2023.2.0
    # until we can update GCPy for the most recent xarray.
    #  -- Bob Yantosca (29 Aug 2023)
    #
    - xarray==2023.2.0                # Read data from netCDF etc files

and replace it with

    - xarray                          # Read data from netCDF etc files

msulprizio commented 1 year ago

This fix resolves the error reported in #254.