pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0

Proposal: New area file format #57

Open djhoese opened 7 years ago

djhoese commented 7 years ago

This is a proposal for pyresample to support a new file format for loading area definitions. There are a lot of decisions that could go into something like this, so I wanted to open an issue before I threw something together and had everyone hate it. Depending on what gets decided, the current format may stay around, but my hope is that the new format is so good that no one will want to use the old one anymore. The new format may also be most flexible if there is a method, function, or class that provides most of the convenience that the configuration provides (see below). What this would mean for the current classes and methods, I'm not sure.

I've been thinking about this for a while now during satpy development because satpy will be heavily used by Polar2Grid and therefore by Polar2Grid's users. Polar2Grid has its own comma-separated way of specifying "grids" and I'd like to use whatever format satpy uses, but I'm not really a fan of the current format, and it doesn't provide all of the features that P2G's format provides. These features revolve around convenience for the user: allowing them to enter what they want or what they know and having logical defaults fill in the gaps. In P2G's case, some users don't know what projections are or don't know that meters in LCC are different than meters in Mercator.

NOTE: All areas have a name and an optional description. The name must be unique among all other areas in the file.

Ways areas could be specified (not limited to just these):

  1. Extents - Outer pixel edge coordinates in meters (legacy spec)
     a. PROJ.4 string (or dictionary, debatable)
     b. 4-element list of extents in projection units (same as current format)
     c. Number of columns and rows in pixels

  2. Geotiff specification
     a. PROJ.4 string or dict
     b. Upper-left origin (X, Y) in projection units
     c. Cell size (X, Y) in projection units (a.k.a. pixel resolution)
     d. Number of columns and rows in pixels

  3. Circle - resulting area is still rectangular
     a. PROJ.4 string or dict
     b. Center coordinate (X, Y) in projection units
     c. Radius in projection units

  4. Area of interest - Geotiff based on center point
     a. Center coordinate (X, Y) in projection units
     b. Cell size (X, Y) in projection units
     c. Number of columns and rows in pixels
     d. PROJ.4 string or dict

  5. Swath Definition - Maybe?
     a. Binary longitude file (.npz is numpy array, .dat is float32 binary, etc?)
     b. Binary latitude file
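To make the relationship between these schemes concrete, here is a rough sketch of how specifications 2, 3, and 4 could each be reduced to the legacy 4-element extents. The function names are hypothetical and not part of pyresample. It assumes extents are ordered `(x_ll, y_ll, x_ur, y_ur)`, that the origin is the outer corner of the upper-left pixel, and that cell sizes are given as positive magnitudes.

```python
def extents_from_origin(origin, cell_size, shape):
    """Geotiff-style spec: upper-left origin, cell size, (cols, rows)."""
    origin_x, origin_y = origin
    cell_w, cell_h = cell_size
    cols, rows = shape
    # X grows to the right of the origin, Y shrinks going down from it.
    return (origin_x,
            origin_y - rows * abs(cell_h),
            origin_x + cols * abs(cell_w),
            origin_y)


def extents_from_center(center, cell_size, shape):
    """Area-of-interest spec: center point, cell size, (cols, rows)."""
    center_x, center_y = center
    cell_w, cell_h = cell_size
    cols, rows = shape
    half_w = cols * abs(cell_w) / 2
    half_h = rows * abs(cell_h) / 2
    return (center_x - half_w, center_y - half_h,
            center_x + half_w, center_y + half_h)


def extents_from_circle(center, radius):
    """Circle spec: the bounding square of the circle (still rectangular)."""
    center_x, center_y = center
    return (center_x - radius, center_y - radius,
            center_x + radius, center_y + radius)
```

Note that the circle scheme only fixes the extents; the number of columns and rows (or a cell size) would still have to come from somewhere else, such as a default.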

In addition to this, I think a user should be able to specify in degrees anything that is in "projection units", except maybe cell size and radius. I've often helped people who started using pyresample, and one of their first questions is "What are these 4 numbers for extents? How do I calculate them?". Allowing someone to enter degrees would help with that.

Another feature would be logical defaults. For example, in number 3 or 4 the PROJ.4 definition could be optional and a "best projection" could be chosen based on the absolute value of the latitude of the center point (e.g. Mercator for < 15, LCC for < 60, polar stereographic otherwise).
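A minimal sketch of that default, assuming the center point is given in degrees. The function name is hypothetical, and the specific PROJ parameters chosen for each case (`lon_0`, `lat_0`, `lat_1`) are illustrative guesses rather than anything decided in this proposal.

```python
def default_projection(center_lon, center_lat):
    """Pick a 'best' PROJ.4 dict from the center latitude (illustrative only)."""
    abs_lat = abs(center_lat)
    if abs_lat < 15:
        # Near the equator: Mercator centered on the area's longitude.
        return {"proj": "merc", "lon_0": center_lon}
    if abs_lat < 60:
        # Mid-latitudes: Lambert Conformal Conic with one standard parallel.
        return {"proj": "lcc", "lon_0": center_lon,
                "lat_0": center_lat, "lat_1": center_lat}
    # High latitudes: polar stereographic on the matching pole.
    pole_lat = 90 if center_lat > 0 else -90
    return {"proj": "stere", "lon_0": center_lon, "lat_0": pole_lat}
```

A real implementation would probably also want to choose an ellipsoid or datum and, for the polar case, a latitude of true scale.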

Another feature that would require support inside of pyresample would be "dynamic grids", as they are called in Polar2Grid. These are grids where things like the origin, the number of cols/rows, or even the cell size are determined by the data being resampled. Based on my experience with this in Polar2Grid, an important thing that can be overlooked is that datasets being resampled for the same geographic scene (time and location) should usually share the same resolved dynamic Area. Most likely this resolved Area would use the highest resolution input swath definition to determine things. This is very important if the resampled results are going to be compared or merged into a composite. In Polar2Grid this resolution step for dynamic areas is hidden inside the resampling code. That probably wouldn't work well for pyresample since it limits what the user can do with the resulting area.
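As a rough illustration of resolving a dynamic area once and sharing it, the sketch below picks the highest-resolution swath and fills in the missing cell size, extents, and shape from it. All of the names here (`Swath`, `ResolvedArea`, `resolve_dynamic_area`) are hypothetical, and it assumes each swath's bounding box is already known in projection units, which skips the real work of projecting lon/lat arrays.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Swath:
    name: str
    resolution: float  # approximate pixel size in projection units
    bounds: tuple      # (x_min, y_min, x_max, y_max) in projection units


@dataclass(frozen=True)
class ResolvedArea:
    cell_size: float
    extents: tuple     # (x_ll, y_ll, x_ur, y_ur)
    cols: int
    rows: int


def resolve_dynamic_area(swaths, cell_size=None):
    """Resolve one area for a whole scene from its finest swath."""
    # Smallest pixel size == highest resolution.
    finest = min(swaths, key=lambda s: s.resolution)
    if cell_size is None:
        cell_size = finest.resolution
    x_min, y_min, x_max, y_max = finest.bounds
    cols = math.ceil((x_max - x_min) / cell_size)
    rows = math.ceil((y_max - y_min) / cell_size)
    # Grow the upper-right corner so the extents hold a whole number of cells.
    extents = (x_min, y_min,
               x_min + cols * cell_size, y_min + rows * cell_size)
    return ResolvedArea(cell_size, extents, cols, rows)
```

The point of returning the resolved area as its own object is that the caller can resample every dataset in the scene to it, rather than having the resampling code resolve it separately (and possibly differently) for each dataset.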

Lastly, what format would best serve something like this:

  1. YAML (my vote)
  2. INI
  3. Other?

pnuu commented 7 years ago

My first very-very-late-night-two-pennies:

djhoese commented 7 years ago

Point 1: Agreed. Although for testing or for the rare case that you want an image in the original sensor space the replication or reducing of data to fit the other datasets.

Point 2: I'm not sure I understand what you're saying. Yes, the projection units depend on the projection, but why does that matter? Each Area listed in the config will have a projection specified, and that's how extents work currently.

djhoese commented 7 years ago

If we put the actual file format to the side for a moment, what if AreaDefinition had something like this for its __init__:

    def __init__(self, area_id=None, name=None, proj_id=None, proj_dict=None, x_size=None, y_size=None,
            area_extent=None, nprocs=1, lons=None, lats=None, dtype=np.float64,
            center=None, pixel_size=None, ...<other defining properties>...):

Knowing @mraspaud the way that I do, and given the arguments we've had on other projects, I'm sure he'll hate how many arguments there are, but that's kind of the point. The above would be backwards compatible for anyone not using named arguments, but would allow users to specify whatever they know and get a working AreaDefinition out of it.

Another option I can see working is to have class methods that create the AreaDefinition based on what "scheme" the user is using. The problem with that is that a user might not know what scheme they are using, some of the parameters overlap between schemes, and whatever loads the config file would need to figure out which class method to call.
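To show what the class method option and its dispatch problem might look like, here is a small sketch. `AreaSpec` is a hypothetical stand-in and not pyresample's `AreaDefinition`; the method names, parameter names, and the exact-key-set matching rule are all assumptions made for illustration.

```python
class AreaSpec:
    """Hypothetical stand-in for an area; extents are (x_ll, y_ll, x_ur, y_ur)."""

    def __init__(self, area_id, proj, extents, shape):
        self.area_id = area_id
        self.proj = proj
        self.extents = tuple(extents)
        self.shape = tuple(shape)  # (cols, rows)

    @classmethod
    def from_extents(cls, area_id, proj, extents, shape):
        return cls(area_id, proj, extents, shape)

    @classmethod
    def from_origin(cls, area_id, proj, origin, cell_size, shape):
        (x0, y0), (dx, dy), (cols, rows) = origin, cell_size, shape
        return cls(area_id, proj,
                   (x0, y0 - rows * dy, x0 + cols * dx, y0), shape)

    @classmethod
    def from_center(cls, area_id, proj, center, cell_size, shape):
        (cx, cy), (dx, dy), (cols, rows) = center, cell_size, shape
        half_w, half_h = cols * dx / 2, rows * dy / 2
        return cls(area_id, proj,
                   (cx - half_w, cy - half_h, cx + half_w, cy + half_h), shape)


# The config loader has to map "which keys did the user give" to a scheme.
SCHEMES = {
    frozenset({"proj", "extents", "shape"}): AreaSpec.from_extents,
    frozenset({"proj", "origin", "cell_size", "shape"}): AreaSpec.from_origin,
    frozenset({"proj", "center", "cell_size", "shape"}): AreaSpec.from_center,
}


def load_area(area_id, params):
    """Choose a class method from the set of keys in one config entry."""
    factory = SCHEMES.get(frozenset(params))
    if factory is None:
        raise ValueError(f"No scheme matches keys: {sorted(params)}")
    return factory(area_id, **params)
```

Matching on the exact key set keeps the dispatch unambiguous, but it also means an entry with one extra or missing key is rejected instead of being filled in with defaults, which is the convenience the single flexible `__init__` is meant to provide.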

mraspaud commented 6 years ago

@davidh-ssec YAML support was added a while ago, can we close this?

djhoese commented 6 years ago

No, I'd like to keep this open until we have all the available forms of creating an area. It is something I'd like to do soon for geo2grid but Kathy has asked for other things first.