ecmwf / earthkit-data

A format-agnostic Python interface for geospatial data
Apache License 2.0
53 stars 10 forks source link

Add method to describe the projection of a field #57

Closed sandorkertesz closed 1 year ago

sandorkertesz commented 1 year ago

On a GRIB field currently we have these methods:

    def proj_string(self):
        return self.proj_target_string()

    def proj_source_string(self):
        return self.handle.get("projSourceString", default=None)

    def proj_target_string(self):
        return self.handle.get("projTargetString", default=None)

The new method called projection() should replace the ones above returning an object providing access to the proj source and proj targets strings as a minimum.

JamesVarndell commented 1 year ago

I have raised a PR to implement this kind of functionality for netCDF data in #33, and intend to do the same thing for GRIB. The idea is to provide a crs() method which returns a cartopy CRS object, which can easily be handled by earthkit-maps or further processed to get things like proj strings, EPSG codes, etc.

Note that this may mean the proj_string methods become obsolete, as this information can be easily extracted from the CRS object.

tlmquintino commented 1 year ago

Do you mean then? crs = d.projection().crs()

JamesVarndell commented 1 year ago

Do you mean then? crs = d.projection().crs()

Not necessarily, but this raises a good point. We have two concepts here (just spelling this out for my own understanding):

  1. A projection, which describes how the earth is mapped to a 2D plane.
  2. A coordinate reference system, which describes how to transform the points on which the data is defined onto real places on the earth.

I think we're likely to need classes representing these concepts for two major use cases: plotting with earthkit-maps and regridding with earthkit-regrid. Right now, earthkit-maps ideally wants something that looks like a cartopy CRS object - but this object is probably possibly not sufficient for earthkit-regrid.

Here's my suggestion:

  1. In the short term, we implement a data.crs() or data.cartopy_crs() method, which returns a cartopy CRS object describing the data's CRS, providing all of the functionality that we need now for plotting. However, I don't think we should make cartopy a dependency of earthkit-data, so you would need to separately install cartopy to use this method.
  2. In the medium term, implement a new class called Projection or similar, probably within earthkit-regrid. This class could provide everything earthkit-regrid needs while also implementing its own to_cartopy_crs() method, and so a data.projection() method would probably supercede the data.cartopy_crs() method in the future.

What do you think @tlmquintino?

tlmquintino commented 1 year ago

That is exactly what I meant. I would start by creating now an abstraction Projection accessible with .projection().

In first step, it will have only one method .to_cartopy_crs(). Later we can add/remove methods to this class as we refine the abstraction with the introduction of the Grid iterator library and with further developing of earthkit-regrid.

No methods about reference frames and projections should be directly in the Data object.

By the way, CRS has other meanings in our field. It can mean Compressed Row Storage of a matrix, which would definetly be computing if you put it in an object that can exactly represent a matrix (as Data does). .projection().crs() makes it unambiguous.

JamesVarndell commented 1 year ago

Here is a notebook demonstrating a first draft of this idea: https://github.com/ecmwf/earthkit-data/blob/feature/projections/docs/examples/projection.ipynb

Here we have a Projection object with the following methods:

At the moment this Projection only supports construction from proj strings, but I'm working on supporting CF "grid mappings" too. Note from cells 6 and 11 that the internal attributes describing the projection do not use PROJ or CF standards. Currently I am using the same internal attributes that cartopy uses, because they are human-readable, independent of input type (PROJ/CF), and very convenient for the to_cartopy_crs method - but I'm very open to changing this in the future.

JamesVarndell commented 1 year ago

Implemented in #105

sandorkertesz commented 1 year ago

@JamesVarndell, can we now remove the proj_sting(), proj_target_string(), proj_source_string() ... methods from GribField?

tlmquintino commented 1 year ago

@JamesVarndell, can we now remove the proj_sting(), proj_target_string(), proj_source_string() ... methods from GribField?

I think we should.