pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License
189 stars 34 forks source link

periodic = True not having desired effect #274

Closed jgiguereCC closed 1 year ago

jgiguereCC commented 1 year ago

Hi! I'm trying to regrid some CMIP6 ocean data (in native ocean curvilinear grid) to a lat lon 1.5x1.5 grid. However I'm getting missing data along the boundary of the original data, that setting periodic = True doesn't fix. Is this intended behavior due to how the original data is structured, or something else? I have a pretty limited background in using xesmf and regridding in general. I've tried playing around with the method, but that doesn't seem to have an effect.

Versions are:

intake = 0.6.7
xesmf = 0.6.3
xmip = 0.7.1

Thanks!!

import intake
from xmip.preprocessing import combined_preprocessing
import xarray as xr
import xesmf as xe
import dask
import numpy as np

url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)

query = dict(experiment_id = "piControl",
             variable_id="tos",
             table_id = "Oday",
             member_id = "r1i1p1f1",
             source_id = "CMCC-CM2-SR5"
            )
dat = col.search(**query)
correct_order = list(dat.df.columns)
dat.esmcat._df = dat.df.groupby(['source_id','experiment_id']).first().reset_index()[correct_order]

z_kwargs = {'consolidated': True, 'decode_times':False}
with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    dset_dict = dat.to_dataset_dict(zarr_kwargs=z_kwargs, preprocessing=combined_preprocessing)

dset = dset_dict['CMIP.CMCC.CMCC-CM2-SR5.piControl.Oday.gn']
dset

image


dset.isel(time = 0, member_id = 0, dcpp_init_year = 0).tos.plot()
Screenshot 2023-06-15 110343
latres = 1.5
lonres = 1.5

CMgrid = xr.Dataset(
    {
        "lat": (["lat"], np.arange(-90, 90, latres)),
        "lon": (["lon"], np.arange(0, 360, lonres)),
    }
)
regridder = xe.Regridder(
    dset, CMgrid, 'bilinear', periodic=True, ignore_degenerate = True
)
regridded = regridder(dset)
regridded.tos.isel(time = 0).plot()
image
aulemahal commented 1 year ago

Hi @jgiguereCC! Thanks for the detailed issue and MWE.

I didn't have easy access to xmip in my current environment, so was not able to imitate your code exactly, but I think xESMF is doing it's job correctly. The problem originates from your data, look at the left most pixels in your first plot : they're all nans.

This can be seen in a clearer way if we "roll" the data, so that the periodic jump in j falls in the middle of plot:

dset.tos.roll(j=180).isel(time=0).plot()

image

My intuition is that this line of NaNs is simply regridded to your output.

A few solutions:

  1. Clip the input data: dset = dset.isel(j=slice(1, -1))
  2. Assign a mask to the input data : ds.assign(mask=ds.tos.isel(time=0, member_id=0).notnull())
  3. Ignore NaNs at regridding time : regridded = regridder(dset, skipna=True)

Best is 1), but only if you know exactly what slices are really invalid. 2) Is dangerous, as it will assign values to the continents too. 3) Works because the line is thin enough.

jgiguereCC commented 1 year ago

ahh thank you! that makes sense