opengeospatial / ogcapi-environmental-data-retrieval

A Web API that provides a family of lightweight interfaces for accessing Environmental Data resources.
https://ogcapi.ogc.org/edr

Provide EDR API Framework to support re-sampling #413

Open solson-nws opened 1 year ago

solson-nws commented 1 year ago

Use of the EDR API continues to grow with time. It has been used in recent Disaster Pilots and now in the Climate Pilot to provide both Analysis Ready Data (ARD) and Decision Ready Data (DRD) to users. Use of the EDR API also continues to grow at Met Centers around the world in support of specific programs. Users are starting to express interest in, and ask questions about, EDR's ability to provide re-sampling support. As we all know, resampling is not a one-size-fits-all model. There are a number of ways to implement resampling, including linear approaches, non-linear approaches, and cubic spline approaches, just to name a few. In an effort to keep the bar low, instead of exposing a variety of resampling approaches within the EDR API, perhaps it would be far better to provide a framework for providers to expose this as an optional EDR API capability. If we chose such an approach, what are the key attributes common to the various approaches that would allow us to establish a framework for resampling? I view this work as similar to our approach for pub/sub. What do folks think about this as a future extension to the EDR API?

Thoughts?
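
To make the "not one-size-fits-all" point concrete, here is a minimal sketch of how a provider might register optional resampling methods by name. The function names, the `METHODS` registry, and the one-dimensional setup are assumptions made for illustration; nothing here is defined by the EDR API.

```python
from bisect import bisect_left
from typing import Callable, Dict, List, Sequence

Resampler = Callable[[Sequence[float], Sequence[float], Sequence[float]], List[float]]


def resample_nearest(xs: Sequence[float], ys: Sequence[float],
                     targets: Sequence[float]) -> List[float]:
    """Return the value of the closest source sample for each target."""
    out = []
    for t in targets:
        i = min(range(len(xs)), key=lambda k: abs(xs[k] - t))
        out.append(ys[i])
    return out


def resample_linear(xs: Sequence[float], ys: Sequence[float],
                    targets: Sequence[float]) -> List[float]:
    """Linearly interpolate between neighbouring samples (xs must be ascending).

    Targets outside the source range are clamped to the end values.
    """
    out = []
    for t in targets:
        if t <= xs[0]:
            out.append(ys[0])
        elif t >= xs[-1]:
            out.append(ys[-1])
        else:
            j = bisect_left(xs, t)  # xs[j - 1] < t <= xs[j]
            w = (t - xs[j - 1]) / (xs[j] - xs[j - 1])
            out.append(ys[j - 1] + w * (ys[j] - ys[j - 1]))
    return out


# Hypothetical registry: a provider lists only the methods it chooses to offer.
METHODS: Dict[str, Resampler] = {
    "nearest-neighbour": resample_nearest,
    "linear": resample_linear,
}


def resample(method: str, xs: Sequence[float], ys: Sequence[float],
             targets: Sequence[float]) -> List[float]:
    """Dispatch to a registered method, rejecting anything not offered."""
    if method not in METHODS:
        raise ValueError(f"unsupported resampling method: {method!r}")
    return METHODS[method](xs, ys, targets)


xs, ys = [0.0, 1.0, 2.0], [10.0, 20.0, 40.0]
print(resample("nearest-neighbour", xs, ys, [0.25, 1.75]))  # [10.0, 40.0]
print(resample("linear", xs, ys, [0.25, 1.75]))             # [12.5, 35.0]
```

The same source samples yield different values depending on the method, which is why the method name would probably need to be one of the common attributes of any framework.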

tomkralidis commented 1 year ago

Agree that this would be valuable. Food for thought:

If we can use any of the above, for example, an EDR extension/conformance class/part (depending) could profile/constrain accordingly as well (unless it is decided to build out in EDR proper).

Having said this, probably valuable to document some use cases here for starters.

m-burgoyne commented 1 year ago

The original concept was for EDR to avoid this level of complexity, but this requirement is an extension of the discussions in issues #362 and #398. #398 is a requirement to allow the user of the API to specify the resolution of a data response, and #362 is a proposal to allow a user to define data aggregation methods. The capabilities proposed in #362 and #398 could be combined to provide resampling functionality.

For example:

"cube": {
  "link": {
    "href": "http://example.service.org/collections/demo/cube",
    "hreflang": "en",
    "rel": "data",
    "variables": {
      "title": "Area query",
      "query_type": "area",
      "output_formats": [
        "CoverageJSON",
        "NetCDF",
        "GRIB2"
      ],
      "default_output_format": "NetCDF",
      "resolution_intervals": {
        "EPSG:4326": {
            "x": [0,0.1,0.2,0.5,1,2,4,6,8,10],    
            "y": [0,0.1,0.2,0.5,1,2,4,8]
        },
        "EPSG:3857": {
            "x": [0,5000,10000],    
            "y": [0,4000,80000]
        }
      },
      "crs_details": [
        {
          "crs": "EPSG:4326",
          "wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
        },
        {
          "crs": "EPSG:3857",
          "wkt": "PROJCS[\"WGS 84 / Pseudo-Mercator\",GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]],PROJECTION[\"Mercator_1SP\"],PARAMETER[\"central_meridian\",0],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",0],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AXIS[\"Easting\",EAST],AXIS[\"Northing\",NORTH],EXTENSION[\"PROJ4\",\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs\"],AUTHORITY[\"EPSG\",\"3857\"]]"
        }
      ],
      "aggregation": {
        "agg_method": [
          { 
            "name": "linear",
            "desc": "linear regridding"
          },
          { 
            "name": "nearest-neighbour",
            "desc": "nearest-neighbour regridding"
          },
          { 
            "name": "area-weighted",
            "desc": "area-weighted regridding"
          }
        ],
        "agg_axis": [
          { 
            "name": "x,y",
            "desc": "Aggregates across spatial dimensions"
          },
          { 
            "name": "x,y,t",
            "desc": "Aggregates across spatial and time dimensions"
          },
          { 
            "name": "t",
            "desc": "Aggregates across the time dimension"
          }
        ]
      }
    }
  }
}

The agg_method property contains a list of the supported resampling methods with descriptions and the agg_axis property contains a list of the valid axis combinations for the query with descriptions.

A client application could then specify the required resampling method in the query by adding the agg_method parameter, and use the agg_axis parameter to define the axes that are resampled.

For example:

http://example.server.org/collections/demo/cube?coords=bbox=-6.0,50.0,-4.35,52.0&x-resolutions=0.1&y-resolutions=0.2&z=1000,900,850,700&parameter-name=Air Temperature&datetime=2022-04-25T22:00Z/2022-04-27T10:00Z&crs=EPSG:4326&f=CoverageJSON&agg_method=linear&agg_axis=x,y
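
A minimal client-side sketch of the idea above: check the requested agg_method and agg_axis against what the collection metadata advertises before building the query. The `AGGREGATION` dictionary mirrors the proposed metadata, and the helper names are hypothetical. The agg_method and agg_axis parameters are the proposal from this thread, not part of the published EDR API.

```python
from typing import Dict, List, Set
from urllib.parse import urlencode

# Mirrors the "aggregation" object from the proposed metadata example.
AGGREGATION: Dict[str, List[Dict[str, str]]] = {
    "agg_method": [
        {"name": "linear", "desc": "linear regridding"},
        {"name": "nearest-neighbour", "desc": "nearest-neighbour regridding"},
        {"name": "area-weighted", "desc": "area-weighted regridding"},
    ],
    "agg_axis": [
        {"name": "x,y", "desc": "Aggregates across spatial dimensions"},
        {"name": "x,y,t", "desc": "Aggregates across spatial and time dimensions"},
        {"name": "t", "desc": "Aggregates across the time dimension"},
    ],
}


def supported_names(aggregation: Dict[str, List[Dict[str, str]]], key: str) -> Set[str]:
    """Collect the advertised names for 'agg_method' or 'agg_axis'."""
    return {entry["name"] for entry in aggregation.get(key, [])}


def build_cube_query(base_url: str, aggregation: Dict[str, List[Dict[str, str]]],
                     agg_method: str, agg_axis: str, params: Dict[str, str]) -> str:
    """Validate the resampling options, then append them to the query string."""
    if agg_method not in supported_names(aggregation, "agg_method"):
        raise ValueError(f"agg_method {agg_method!r} is not advertised")
    if agg_axis not in supported_names(aggregation, "agg_axis"):
        raise ValueError(f"agg_axis {agg_axis!r} is not advertised")
    query = {**params, "agg_method": agg_method, "agg_axis": agg_axis}
    # Keep commas, colons and slashes readable; spaces are encoded as '+'.
    return f"{base_url}?{urlencode(query, safe=',:/')}"


url = build_cube_query(
    "http://example.server.org/collections/demo/cube",
    AGGREGATION,
    agg_method="linear",
    agg_axis="x,y",
    params={"crs": "EPSG:4326", "f": "CoverageJSON"},
)
print(url)
```

Any other query parameters are passed through unchanged via `params`, so the check only concerns the two resampling options.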

dblodgett-usgs commented 1 year ago

I agree that this could be a useful use case to support on the server side of an application stack. However, I think this feature is out of scope for EDR. I see three implementation pathways for it to be supported more broadly though.

  1. This would be a great application of OGC-API Processes.
  2. This is already a core use case for OGC-API Coverages.
  3. Multiple resolutions of a given dataset could be exposed as additional EDR variables to support differing use cases.
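
As a rough illustration of the first pathway, the sketch below prepares (but does not send) an OGC API - Processes execute request that points at an EDR query as its input. The process id `resample`, the input names, and the example URLs are all hypothetical; only the general shape (a POST to `/processes/{processId}/execution` with an `inputs` object) follows OGC API - Processes.

```python
import json
from urllib.request import Request


def build_resample_execution(processes_root: str, edr_query_url: str,
                             method: str, x_res: float, y_res: float) -> Request:
    """Prepare an execute request for a hypothetical 'resample' process."""
    body = {
        "inputs": {
            # Hypothetical input names; a real process description would define them.
            "source": {"href": edr_query_url},
            "method": method,
            "x_resolution": x_res,
            "y_resolution": y_res,
        }
    }
    return Request(
        url=f"{processes_root.rstrip('/')}/processes/resample/execution",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_resample_execution(
    "http://example.server.org/",
    "http://example.server.org/collections/demo/cube?f=CoverageJSON",
    method="linear",
    x_res=0.1,
    y_res=0.2,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With this split, EDR stays responsible for direct retrieval and the process owns the resampling semantics.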

tomkralidis commented 1 year ago

Could EDR provide guidance on how to do resampling in OGC API - Processes with a specific set of rules, for example? That is, we provide .../items in the same way, following OGC API - Features.

dblodgett-usgs commented 1 year ago

I like that idea. e.g. "EDR is for simple direct data retrieval, use cases like resampling should be supported by creating OGC API processes that extend EDR using a pattern like ... "

Is that what you mean?

tomkralidis commented 1 year ago

What about retrieving with resampling parameters? We can use/leverage/specify "by reference" to other standards without a rewrite/duplication.

Given the discussion in this issue, it looks like we already have this functionality covered; maybe it's a matter of making this more "visible" in the spec, for example.

chris-little commented 1 year ago

After the various discussions above and offline, I think that re-sampling has the potential to over-complicate the EDR API as envisaged and agreed. We need to gather use cases from a number of users for the requirement, and we also need to distinguish between re-sampling, say between two slightly different grids, and interpolation between two or more points, possibly to an extreme of dozens or hundreds of intermediate points.

Unless there is a large unmet demand, I think we do not need to standardise re-sampling in EDR. Perhaps we could ask for an extension, and then see how successful the extension is.

chris-little commented 1 year ago

Any extension implementation should be clear about exactly which use cases are being addressed.

This may also impact the work on restrictive/extensive profiles.