pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
343 stars 95 forks source link

Add conservative resampler #440

Open zxdawn opened 2 years ago

zxdawn commented 2 years ago

The conservative resampler is

The value of a destination cell is calculated as the weighted sum of the values of the source cells that it overlaps.

according to ESMPy.

It's also supported in xesmf.

If we can add it to pyresample, it would be nice.

djhoese commented 2 years ago

Would you want to use this in satpy? If satpy could access the implementation in xesmf would that be good enough? Would you be using xarray objects with it?

Also, there are some weighted resampling functions already in numpy-based parts of pyresample. Also EWA, which is in pyresample, is a weighted average but requires the data to be scan based (ex. Polar-orbiters).

zxdawn commented 2 years ago

Yes, I wanna use it with satpy. I have run it successfully by converting the Scene to Dataset and regridding it by xESMF. Because xESMF needs specific lon/lat/lon_b/lat_b variables, it's harder to access it in Satpy.

And, I tried to reproduce xESMF results using EWA resampler, but the result looks strange. Here's the comparison notebook

zxdawn commented 2 years ago

Here's the concept of conservative resampler which considers the weighted average:

image

djhoese commented 2 years ago

For EWA, you'd want to play with these parameters:

https://github.com/pytroll/pyresample/blob/74eb088c14c15a524721b9e896073744f4234133/pyresample/ewa/dask_ewa.py#L512-L519

Set weight_distance_max to 2.0 and weight_delta_max to 40.0 as a starting point. That's what I use for VIIRS data.

EWA definitely only uses the "center" points of the pixels. It assumes a pixel size based on the curvature of the earth and the rows per scan. So if tropomi isn't scan based then what you've done with setting rows_per_scan to the total number of rows is "good enough", but also means EWA is less likely to make pretty results compared to other algorithms.

We don't have any resampling algorithms in pyresample that use other information beyond the center of the pixel. I think the best route forward would be to add functionality in satpy to allow for this resampler to be used from xesmf. If you could make it have nice defaults for the bounding coordinate information so that sensors without this information could still use it that'd be really cool, but an exception being raised would be fine with me too.

zxdawn commented 2 years ago

Got it. For the new conservative resampler, here's my idea: If lon_b and lat_b are not available, we can create them following my notebook method which uses xgcm. I suppose satpy-supported datasets usually just have center points?

Oh ... Maybe it's better to make xESMF create the lon_b and lat_b automatically if they don't exist. Then, it would be easy to call it in pyresample.

djhoese commented 2 years ago

If it makes mathematical sense to compute the bounds from the center points then sure that seems interesting. I see your xesmf issue above. Another option is we could make SwathDefinitions be smarter about additional metadata (like rows_per_scan or bounding information) and have methods on it for returning or generating that information from the information contained in the SwathDefinition.

ghiggi commented 2 years ago

I was just working these days on something related to that and I have made the following thoughts.

I wonder if we could add methods to AreaDef and SwathDef to retrieve:

In addition to being useful for conservative remapping with xESMF, these methods could be used to develop the following ones:

For pyresample structured grids, the corner and bounds would be guessed assuming equal spacing on either side of the coordinate labels. Then the question would be if to provide the option to infer the spacing in

What do you think?

zxdawn commented 2 years ago

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

djhoese commented 2 years ago

@ghiggi When you say "For pyresample structured grids", do you mean an AreaDefinition? Then I think any spacing decisions should be in the area definitions projection space. For SwathDefinitions I think you have to do it in geocentric.

ghiggi commented 2 years ago

@djhoese ... it was my natural mental separation from resampling approaches related to spherical unstructured grids. In such a case the (node) coordinate is 1D, and the simplest way to retrieve "a mesh" is to use SphericalVoronoi. But depending on the knowledge of the background spherical sampling, the mesh can be inferred for the Healpix, (Reduced) Gaussian Legendre, Icosahedral, Octahedral and Equiangular samplings. I think it goes outside the pyresample scope. If one day you want to include also the unstructured grids, I have some code for this scope.

I can draft the methods described above in the coming days :D

djhoese commented 2 years ago

Sure. Yeah. I understood that. :confused: <whoosh emoji>

jbusecke commented 2 years ago

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

I have just mentioned that in https://github.com/xarray-contrib/cf-xarray/issues/71 (which might have some interesting overlap with this disussion? Even though I am not sure CF conventions are general enough for the problems described here), but I think the ability to infer grid cell bounds/corners(nodes) should really never have been in xgcm (blame a green OSS contributor named Julius for not realizing that back then...).

Xgcm should consume fully defined finite grids, not produce/infer them. As I mentioned over in the above issue, I am planning on removing the 'autogenerate' functionality from xgcm, but a general replacement would be very beneficial it seems for the broader community.