pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License
188 stars 34 forks source link

When to apply `add_nans_to_weight` #56

Closed aulemahal closed 3 years ago

aulemahal commented 3 years ago

In the case where dsout doesn't need a mask (like it would be ones everywhere), but still has points outside the domain of dsin, the current way to get nans instead of 0 is by adding a useless mask. This feels slightly hacky, could there be a kwarg controlling this behavior? Like add_nans_to_weight=True, or something in the like.

Also, I'm not sure I understand why someone would want 0s in the output, shouldn't the nan-filling behavior be default?

Happy to make a small PR if needed, following the discussion.

huard commented 3 years ago

I think add_nans_to_weight could always be the default behavior. This seems consistent with discussions in https://github.com/pangeo-data/xESMF/issues/29#issuecomment-742371696

Please submit a PR and a notebook example demonstrating the issue.

sol1105 commented 3 years ago

https://nbviewer.jupyter.org/github/roocs/regrid-prototype/blob/main/docs/notebooks/xESMF_add_matrix_NaNs.ipynb In the above notebook I hope the issue is demonstrated.

The following is a notebook created for a different purpose, but one can see the application of add_nans_to_weights when remapping data from a regional to a global grid: https://nbviewer.jupyter.org/github/roocs/regrid-prototype/blob/main/docs/notebooks/xESMF_curvilinear.ipynb

I suggest to apply add_nans_to_weights as default for all but the nearest neighbour methods. Additionaly one could introduce a keyword for the Regridder, in case someone does not want the modification of the ESMF generated weights?

For the nearest neighbour methods the add_nans_to_weights function is no solution to have out-of-source-domain cells masked in the target grid. An option could be to define a maximum distance (eg. 2x the average grid spacing) up to which a mapping to nearest cells may happen. Is there already such a possibility/option in ESMF?

stefraynaud commented 3 years ago

As suggested @huard , you are welcome to make PR Martin!

sol1105 commented 3 years ago

I made a PR to set add_nans_to_weights as the default to always mask unmapped / out-of-source-domain grid cells of the target grid (#94).

Regarding my comment above on how to allow masking such grid cells also for the nearest_s2d/d2s methods, I made a notebook on how it could look: https://nbviewer.jupyter.org/github/roocs/regrid-prototype/blob/main/docs/notebooks/xESMF_curvilinear_nearest_mask_unmapped.ipynb

Please give your opinion whether it is useful to implement this as feature of xESMF (eg. as default behaviour if nearest_* is selected as method and yet no output mask is defined) and some feedback if this approach makes sense or what could be improved (I got some inspiration from here: https://github.com/FESOM/pyfesom/blob/master/pyfesom/regriding.py). If I overlooked a similar feature in ESMF / ESMPy that would be also very helpful information :)