SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
625 stars 282 forks source link

Unify regridders #4754

Open stephenworsley opened 2 years ago

stephenworsley commented 2 years ago

📰 Custom Issue

There are a collection of regridders in iris which each work slightly differently, but are similar enough that they could potentially benefit from unifying some of their behaviour and perhaps have them inherit from the same abstract class. This would also benefit regridders built outside of iris such as in iris-esmf-regrid. The current set of iris regridding schemes is:

There is also the deprecated regridder in iris.experimental.regrid_conservative which is functionally replaced by iris-esmf-regrid.


The current set of regridders have the following quirks:

Linear and Nearest

AreaWeighted

UnstructuredNearest

PointInCell


Suggestions

Currently, the most sophisticated regridder for metadata handling is RectilinearRegridder. The problem seems to be that other regridders have slightly different use cases so they can't fully take advantage of the infrastructure in RectilinearRegridder. These regridders could be improved by having them derive from a regridder class which is more generic than RectilinearRegridder. This could also help the creation of future regridders (like in iris-esmf-regrid). Some ideas for how this regridder might be structured:

_AbstractRegridder

Rough proposed common structure of regridders:

The following is copied from a comment in #4807 where it demonstrates a structure to aim for which would allow the possibility to derive from a common class. The aim here is to bring all regridder closer to this structure, improving their functionality along the way, and then refactor this structure in terms of class inheritance when that becomes possible.


def regrid(data, dims, regrid_info):
  ...
  return new_data

def create_cube(data, src, src_dims, tgt_coords, callback):
  ...
  return result_cube

def _prepare(src, tgt):
  ...
  return regrid_info

def _perform(cube, regrid_info):
  ...
  dims = get_dims(cube)
  data = regrid(cube.data, dims, regrid_info)
  callback = functools.partial(regrid, regrid_info=regrid_info)
  return create_cube(data, src, dims, tgt_coords, callback)

class Regridder:
  def __init__(src, tgt):
    ...
    self._regrid_info = _prepare(src, tgt)
  def __call__(src):
    ...
    return _perform(src, self._regrid_info)

Sub-tasks/Side-tasks

Aside from the larger task of refactoring all the regridders to derive from an abstract class, there are several smaller tasks which could help to unify the behaviour of regridders. Bear in mind some of these tasks may have redundancy with the larger refactoring task.

stephenworsley commented 2 years ago

An example of _create_cube being written in a generic way that supports multiple different dimensionalities can be found in iris-esmf-regrid. Especially in this PR, also concerned with unifying regridders https://github.com/SciTools-incubator/iris-esmf-regrid/pull/175.

pletzer commented 2 years ago

Thanks @stephenworsley for raising this issue. There would be great benefit in having a uniform API for regridding and interpolation. My take on this:

  1. Let's distinguish between 3 types of operations: projection, interpolation and regridding. a. Projection is the computation of line integrals, fluxes and volume integrals. The target is a point, line, area or volume, depending on the field staggering. b. Interpolation is the projection of the field onto a point. c regridding is projection applied to grid elements (nodes, edges, faces and cell volumes)
  2. The mesh. It can be unstructured, structured or rectilinear. The distinction is important because projection, interpolation and regridding require one to quickly find the cell that contains a point. This search is very fast for an uniform grid, but can become slow for a general structured grid and even slower for an unstructured grid. While every structured grid can be represented as an unstructured grid and every structured grid as an unstructured grid, there is a clear advantage in using the method most suited for a grid.
  3. The field staggering (nodal, edge, face or cell). Each staggering requires its own projection/interpolation/regridding method.
  4. The dimensionality of space, 3d, 2d, 1d, 2d + vertical axis, ....

From this perspective we can classify each of the existing projection/interpolation and regridding methods. Eg, UnstructuredNearest for the interpolation of a nodal field on an unstructured grid, RectilinearRegridder for regridding of a nodal field, .... Ideally we want all the methods implemented for 2d/3d, different staggerings and mesh types with dispatching the operation to the "right" method based on the grid/field metadata.