pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License
189 stars 34 forks source link

Differences in bilinear interpolation between xESMF and CDO/xarray.interp #282

Open BSchilperoort opened 1 year ago

BSchilperoort commented 1 year ago

To avoid any conda dependencies, I am working on writing some regridding methods in xarray (with dask and flox).

However, I noticed a structural difference between the bilinear interpolation as xESMF computes it and other methods. As a reference I have used CDO (Climate Data Operators).

The following plot shows a regridding operation on ERA5 monthly dewpoint temperature data (in Kelvin). The data plotted is (a - b)/a (the relative error). The old resolution is 0.25 degrees, the new resolution is 0.17 degrees.

image

Both the CDO and xESMF use the same Dataset as grid source.


To reproduce:


xarray interpolation code ```py data.interp( coords={ "longitude": lon_coords, "latitude": lat_coords, }, method=method, ) ```
aulemahal commented 1 year ago

Quite interesting!

I think ESMF doesn't work with the lat/lon coordinates, but maybe with their translation into spherical coordinates ? At least, that's what some issues on conservative regridding made me think. Nonetheless, I think you might get a better answer by raising this issue to https://github.com/esmf-org/esmf/ directly (or by email https://earthsystemmodeling.org/support/). xESMF is only a wrapper.

Your work in reimplementing regridding methods is interesting! I often find ESMF to be a bit heavy on dependencies side indeed. But one crucial feature that makes it better than xr.interp is that it computes weights first and then applies them in parallel for all spatial slices. This is a major dealbreaker when dealing with long timeseries! Do you have plans on implementing something like that in pure-xarray ?

BSchilperoort commented 1 year ago

Thanks for your reply. I posted it here because I am not sure if it's an ESMF or xESMF issue. I guess posting it there won't hurt.

I am working on implementing regridding methods in pure xarray (or xarray w/ flox), but note: I am only working with rectilinear data. Currently I have bilinear and nearest-neighbor regridding working, with a surprisingly small amount of code: just using interp works fine.

Bilinear/nearest/cubic interpolation requires no weights at all. By using xr.Dataset's interp method it works on dask arrays allowing for parallel computation, it is significantly faster and more memory efficient than ESMF for this use case.

However, conservative regridding proves to be much more cumbersome.