INSPIRE-MIF / gp-ogc-api-features

Good Practice document for INSPIRE download services based on OGC API - Features

Good practice for on-demand extracted features #70

Open tervo opened 3 years ago

tervo commented 3 years ago

OGC API - Features is an awkward fit for some INSPIRE datasets, such as the numerical weather predictions in the Atmospheric Conditions and Meteorological Geographical Features themes, where data is extracted from multidimensional data cubes at request time.

In those cases, the features do not exist as such until they are requested. The main challenge in using OGC API - Features for such data is communicating the possible filtering parameters. Consider, for example, the Finnish Meteorological Institute's HIRLAM weather forecast available at http://beta.fmi.fi/data/3/wfs/sofp/ (not INSPIRE good practice compliant). The user may request data with several query parameters, including latlon and bbox. However, no definite, predefined list of possible latlon points exists: the user may request any point inside the data coverage, and the data is interpolated on demand to that specific location.
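For illustration, a point time series request against such a service might look like the following minimal sketch. The `latlon` parameter name and the base URL come from the description above; the collection path (`hirlam`) and the response handling are assumptions:

```python
import requests

# Hypothetical request against the beta service described above. The latlon
# parameter is from the service description; the collection id ("hirlam")
# and the items path are assumptions for illustration.
BASE = "http://beta.fmi.fi/data/3/wfs/sofp"

resp = requests.get(
    f"{BASE}/collections/hirlam/items",
    params={
        # Any point inside the data coverage is valid; the server
        # interpolates the forecast to this location on demand.
        "latlon": "60.17,24.94",  # not drawn from any predefined list
    },
)
resp.raise_for_status()
for feature in resp.json().get("features", []):
    print(feature["properties"])
```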

The described scheme works relatively well in practice, but it is neither intuitive nor fully machine-readable, because clients cannot discover the filtering possibilities from the API definition and build requests accordingly.

Multidimensional data cubes are addressed by OGC API - Coverages. However, the great majority of users are not interested in coverages, nor do they have the technical capability to fetch and process such data. For example, the Finnish Meteorological Institute's open data and INSPIRE download service provides both grid (coverage) and point time series data. During 2019, 99.7 percent of requests were for point time series rather than coverages.

The upcoming OGC standard Environmental Data Retrieval (EDR) is designed specifically for providing data from data cubes, and includes capabilities for both point time series and coverages. However, the standard has not been published yet, and there is no ready server or client support.

I hope this group will either take a stand (in the good practice document) that using OGC API - Features for cases where data is extracted on demand is a good practice, or initiate new good practice work for such cases.

cportele commented 3 years ago

I may misunderstand what you are doing (in which case: forget this comment), but from your description it seems to me that you are doing two things on the Features resource in that API.

  1. You provide a features API where each value in the data cube is a feature. There is a finite number of them, and you can iterate over them (otherwise it would not be possible to publish this via a Features API) and query subsets.
  2. You provide API extensions on the Features resource that allow you to interpolate values.

EDR separates the two aspects into separate resources (Features and the EDR query patterns). Your latlon/lonlat seems to correspond to coords in the position query. If you follow the same approach and separate the resources, I think the issue goes away.
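As a rough sketch of that separation (hypothetical base URL, collection id, and parameter name; the `/items` and `/position` paths follow the OGC API - Features and draft EDR query patterns respectively):

```python
import requests

BASE = "https://example.org/api"  # hypothetical landing page

# 1. Features resource: the stored grid values are features, so subsets are
#    selected with the standard OGC API - Features parameters.
stored = requests.get(
    f"{BASE}/collections/hirlam/items",
    params={"bbox": "24.5,60.0,25.5,60.5", "limit": 100},
)

# 2. EDR position query: interpolation to an arbitrary point lives on a
#    separate resource; coords takes a WKT point (the role latlon plays above).
interpolated = requests.get(
    f"{BASE}/collections/hirlam/position",
    params={
        "coords": "POINT(24.94 60.17)",
        "parameter-name": "temperature",  # hypothetical parameter id
    },
)
print(stored.status_code, interpolated.status_code)
```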

In the context of INSPIRE Download Services I would argue that only item 1 from above is in scope. The only way to download the dataset is by downloading (parts of) the data cube; everything else is processing, which is not really in scope of the INSPIRE Directive (well, maybe through the obscure "services allowing spatial data services to be invoked").

tervo commented 3 years ago

You understood correctly, although we haven't defined any extension in this beta service. It is just designed to work that way.

Strictly speaking, only item 1 is in the scope of INSPIRE. But to provide services that are at all useful to users, interpolation is required. Thus, I argue that item 2 also needs to be considered somehow in this context.

I also need to mention that the number of data points (i.e. features) in the data cube is too large to be used as the basis for filtering. First of all, one cannot request all of them (regardless of paging). In practice, filtering based on the original data points would also require overly complex requests compared to the task.
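To illustrate: without server-side interpolation, a client wanting a value at one arbitrary location would have to fetch the surrounding stored grid-point features and interpolate locally, roughly as in this sketch (hypothetical base URL, collection id, property name, and grid spacing; inverse-distance weighting stands in for whatever interpolation the server would use):

```python
import requests

BASE = "https://example.org/api"  # hypothetical service
lat, lon, cell = 60.17, 24.94, 0.07  # target point; assumed grid spacing

# Fetch the stored grid-point features surrounding the target location.
resp = requests.get(
    f"{BASE}/collections/hirlam/items",
    params={"bbox": f"{lon - cell},{lat - cell},{lon + cell},{lat + cell}"},
)
features = resp.json()["features"]

def weight(f):
    # Inverse-distance weight of a grid point relative to the target.
    x, y = f["geometry"]["coordinates"]
    return 1.0 / max(((x - lon) ** 2 + (y - lat) ** 2) ** 0.5, 1e-9)

# The client re-implements the interpolation that a single server-side
# position-style request would otherwise perform.
total = sum(weight(f) for f in features)
value = sum(weight(f) * f["properties"]["temperature"] for f in features) / total
print(value)
```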

Thus, I still think that either EDR or some interpolation extension is required.