opengeospatial / ogcapi-environmental-data-retrieval

A Web API that provides a family of lightweight interfaces for accessing Environmental Data resources.
https://ogcapi.ogc.org/edr

Add support for basic data summary #362

Open m-burgoyne opened 2 years ago

m-burgoyne commented 2 years ago

A common use case for the data returned by an EDR query is to generate a summary value for the area and time of interest. Adding optional support for basic data aggregation methods could improve service performance by reducing the volume of data returned by queries.

This could be achieved by adding optional support for new query parameters that describe the aggregation methods and the axes to calculate across.

The proposal is to add functionality that allows an EDR service to advertise aggregation support (if it has any), with that support defined at the individual query type level.

This information could be added to the DataQuery metadata for collections in a service that supports data aggregation, by adding a new property that lists the available methods and valid axis combinations.

An example of the suggested metadata description can be seen below:

"area": {
  "link": {
    "href": "http://example.service.org/collections/demo/area",
    "hreflang": "en",
    "rel": "data",
    "variables": {
      "title": "Area query",
      "query_type": "area",
      "output_formats": [
        "CoverageJSON",
        "GeoJSON"
      ],
      "default_output_format": "GeoJSON",
      "crs_details": [
        {
          "crs": "EPSG:4326",
          "wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
        }
      ],
      "aggregation": {
        "agg_method": [
          { 
            "name": "sum",
            "desc": "Compute a total from the requested data"
          },
          { 
            "name": "average",
            "desc": "Compute the average value from the requested data"
          },
          { 
            "name": "Max",
            "desc": "Compute the Maximum value from the requested data"
          },
          { 
            "name": "Min",
            "desc": "Compute the Minimum value from the requested data"
          }
        ],
        "agg_axis": [
          { 
            "name": "x,y",
            "desc": "Aggregates across spatial dimensions"
          },
          { 
            "name": "x,y,t",
            "desc": "Aggregates across spatial and time dimensions"
          },
          { 
            "name": "x,y,z",
            "desc": "Aggregates across spatial and vertical dimensions"
          },
          { 
            "name": "t",
            "desc": "Aggregates across the time dimension"
          },
          { 
            "name": "z",
            "desc": "Aggregates across the vertical dimension"
          }
        ]
      }
    }
  }
}

The agg_method property lists the supported aggregation methods, and the agg_axis property lists the valid axis combinations for the query. Each entry carries a name and a description.
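
As a rough illustration of how a client might read this metadata, the sketch below extracts the advertised method and axis names from a trimmed copy of the example. The helper name supported_aggregations is hypothetical, and the property names are only those suggested in this proposal, not part of the published EDR standard.

```python
import json

# Trimmed, hypothetical copy of the proposed "variables" metadata shown above.
variables = json.loads("""
{
  "title": "Area query",
  "query_type": "area",
  "aggregation": {
    "agg_method": [
      {"name": "sum", "desc": "Compute a total from the requested data"},
      {"name": "average", "desc": "Compute the average value from the requested data"}
    ],
    "agg_axis": [
      {"name": "x,y", "desc": "Aggregates across spatial dimensions"},
      {"name": "t", "desc": "Aggregates across the time dimension"}
    ]
  }
}
""")


def supported_aggregations(variables):
    """Return (method names, axis names) advertised for a query type.

    Both sets are empty when the service does not advertise aggregation,
    since the proposal makes the property optional.
    """
    agg = variables.get("aggregation", {})
    methods = {entry["name"] for entry in agg.get("agg_method", [])}
    axes = {entry["name"] for entry in agg.get("agg_axis", [])}
    return methods, axes


methods, axes = supported_aggregations(variables)
print(sorted(methods), sorted(axes))
```

Returning empty sets for a missing property lets a client treat "no aggregation support" the same way as "no matching method", without special-casing older services.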

A client application could then specify the required aggregation in the query by adding agg_method and agg_axis query parameters.

For example:

http://example.server.org/collections/demo/area?coords=POLYGON((-2.052 52.925,-0.476 51.017,0.887 51.566,-0.608 52.911,-2.052 52.925))&parameter-name=Air Temperature&datetime=2022-04-25T22:00Z/2022-04-27T10:00Z&crs=EPSG:4326&f=CoverageJSON&agg_method=sum&agg_axis=x,y
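
A minimal sketch of how a client could assemble such a request, assuming it has already read the advertised method and axis names from the metadata. The function build_area_query is hypothetical; agg_method and agg_axis are the parameter names suggested in this proposal. Note that urlencode percent-encodes the commas, spaces and parentheses that appear unencoded in the example URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_area_query(base_url, params, agg_method, agg_axis, methods, axes):
    """Build an area query URL with the proposed aggregation parameters.

    Raises ValueError if the requested method or axis combination is not
    among those the service advertises.
    """
    if agg_method not in methods:
        raise ValueError(f"unsupported agg_method: {agg_method!r}")
    if agg_axis not in axes:
        raise ValueError(f"unsupported agg_axis: {agg_axis!r}")
    query = dict(params, agg_method=agg_method, agg_axis=agg_axis)
    return f"{base_url}?{urlencode(query)}"


url = build_area_query(
    "http://example.server.org/collections/demo/area",
    {
        "coords": "POLYGON((-2.052 52.925,-0.476 51.017,0.887 51.566,"
                  "-0.608 52.911,-2.052 52.925))",
        "parameter-name": "Air Temperature",
        "datetime": "2022-04-25T22:00Z/2022-04-27T10:00Z",
        "crs": "EPSG:4326",
        "f": "CoverageJSON",
    },
    agg_method="sum",
    agg_axis="x,y",
    methods={"sum", "average"},   # assumed to come from the service metadata
    axes={"x,y", "t"},            # assumed to come from the service metadata
)
print(url)
```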

chris-little commented 2 years ago

EDR API SWG 75 clarified that this aggregation is not 'regridding', which could be another enhancement, but a summary aggregation.

chris-little commented 2 years ago

Discussion at EDR API SWG 76 clarified that the summary is for the full domain, and should specifically exclude sub-selection in the domain. The name of this issue needs improving.
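
To make the "full domain" reading concrete, here is a small sketch in plain Python, not taken from any implementation. It assumes the returned data is a nested list indexed [t][y][x] and that method names are lowercase. Every cell in the requested domain contributes to the result; nothing is sub-selected or regridded.

```python
def aggregate(values, method, axes):
    """Summarise a [t][y][x] nested list across the given axis combination.

    Only the "x,y" and "x,y,t" combinations are sketched here.
    """
    reducers = {
        "sum": sum,
        "average": lambda v: sum(v) / len(v),
        "max": max,
        "min": min,
    }
    reduce_cells = reducers[method]
    if axes == "x,y":
        # One summary value per time step, over the whole spatial domain.
        return [reduce_cells([v for row in grid for v in row]) for grid in values]
    if axes == "x,y,t":
        # A single summary value over the whole space-time domain.
        return reduce_cells([v for grid in values for row in grid for v in row])
    raise ValueError(f"axis combination not sketched here: {axes!r}")


# Two time steps of a 2x2 grid (made-up values for illustration only).
data = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
print(aggregate(data, "sum", "x,y"))        # [10.0, 26.0]
print(aggregate(data, "average", "x,y,t"))  # 4.5
```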

solson-nws commented 2 years ago

@dblodgett-usgs @chris-little @m-burgoyne -- Probably need to define the boundary when a process service would come into play versus extending EDR capabilities.

chris-little commented 2 years ago

@solson-nws Requiring data from more than one collection is definitely out of scope for summary aggregation, and in scope for API-Processes. E.g. picking out a max value versus combining wind components (u,v) to get speed and direction (ff,ddd).

chris-little commented 2 years ago

EDR API SWG 81 encourages implementers to develop a proof-of-concept to explore the feasibility of the proposal.

chris-little commented 1 year ago

The API-Coverages group is also interested in summary stats.