ILIAD-ocean-twin / data_access_api

Apache License 2.0

Iliad Data Access API

The Iliad Data Access API is based on the OGC EDR API, which is designed to provide access to MetOcean data through a convenient API that supports coverage and vector data. An Iliad profile will be defined. The profile consists of the schema profile, sample configurations of the reference implementations, and a hosted persistent playground.

The federated architecture of the Digital Twin can require several types of interfaces:

This repository focuses on the data access APIs that are not covered by other tasks but shall be semantically integrated with them. At the same time, the data models and querying shall be adaptable to various protocols such as event streaming and catalogs.

Groundwork

Based on the discussions, the identified ILIAD data includes various types:

Some of the pilots already provide data through legacy APIs such as OGC WCS, WMS, WFS, OpenDAP and the ERDDAP API. The variety of APIs reflects the variety of needs and of the applications that use these access methods.

The approach taken is to define a core suite of standard elements that can express the data in alignment with the Ocean Information Model. The OGC Environmental Data Retrieval (EDR) API was selected as the starting point: it is a modern interface developed a few years ago in line with the de-facto ICT industry standards for API definition (Swagger/OpenAPI), it allows semantic 'uplift' of poorly structured JSON through the JSON-LD extension, it supports raster and vector data, and it is backed by the Met Ocean and Marine working groups and the agencies behind them.
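The semantic 'uplift' mentioned above can be illustrated with a minimal sketch: a plain JSON response is given a JSON-LD `@context` so that its keys can be resolved against a vocabulary. The context URL and the sample keys below are hypothetical placeholders, not the actual OIM context.

```python
import json

# Hypothetical context URL: the real OIM context location is not given here.
OIM_CONTEXT_URL = "https://example.org/oim/context.jsonld"


def uplift(document: dict, context_url: str = OIM_CONTEXT_URL) -> dict:
    """Return a copy of a plain JSON document with a JSON-LD @context attached."""
    uplifted = {"@context": context_url}
    # Copy every original key, replacing any pre-existing context.
    uplifted.update({k: v for k, v in document.items() if k != "@context"})
    return uplifted


# Illustrative payload; the keys are assumptions, not an Iliad schema.
plain = {"id": "sea_surface_temperature", "unit": "K"}
print(json.dumps(uplift(plain), indent=2))
```

The original document is left untouched, so the same payload can still be served as plain JSON to clients that do not understand JSON-LD.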

Data Access APIs SUITE

Considering all the recognised scenarios, the API suite can contain:

| Data Access Protocols | Dataset discovery support | Extended source information | Access method | Semantic support of OIM |
| --- | --- | --- | --- | --- |
| SensorThingsAPI | no general level information | all the fine-grained metadata available for sensors, FoI, Thing | OData/HTTP access to granular data, filtering and grouping | OIM LD context/entailment |
| OGC API Coverages/WCS | OGC API compliant | limited in standard, available through extensions | OpenAPI/HTTP, access to aggregates with trimming and resolution scaling | OIM LD context/entailment |
| OGC API EDR | OGC API compliant | limited in standard, available through extensions | OpenAPI/HTTP, access to aggregates with trimming | OIM LD context/entailment |
| OGC API Tiles/WM(T)S | OGC API compliant | limited in standard, available through extensions | OpenAPI/HTTP, access to aggregates as tiles with trimming and resolution scaling | OIM LD context/entailment |
| CF-storages | in-file DDS metadata | file and variable level key-values | storage specific | various |
| OpenDAP | DDS based | NetCDF-like | HTTP | NcML |

Metadata role in data access

Metadata describing data assets is integrated in the APIs at various levels, as briefly mentioned in the table above. Most modern Web APIs, like the SensorThings API and the Features API, include generic layer metadata in the standard.

Spatial APIs and spatial cloud native storages

A recent review of storage formats in the spirit of cloud nativeness has converged on several data formats (such as Zarr, Cloud Optimised GeoTIFF, GeoParquet and NetCDF+Kerchunk) that enable random access to data chunks. These formats can be indexed and accessed directly, without a dedicated application-level server. This can be implemented as "serverless" data access, which has several pros and cons.
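As a rough illustration of such direct access, the sketch below turns a Kerchunk-style reference entry (`[url, offset, length]`) into the HTTP `Range` header a client would send to fetch one chunk. The reference key, URL and byte values are made up for the example, and no request is actually issued.

```python
def range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Hypothetical Kerchunk-style reference: chunk key -> [url, offset, length].
references = {
    "temperature/0.0.0": ["https://example.org/data/model.nc", 4096, 1024],
}

url, offset, length = references["temperature/0.0.0"]
headers = range_header(offset, length)
print(url, headers)
```

Any HTTP client can then send `headers` with a GET request to `url`; a server that supports range requests answers with status 206 and only the requested bytes.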

Iliad specificity

Iliad efforts in API development are tightly aligned with the standardisation processes, so the intention is not to design a totally new technical stack, but rather to contribute to the existing and emerging standards with the experience gained in the project, and to support the development of standard APIs where they do not exist.

However, all the web APIs shall follow several alignment requirements:

SensorThings APIs

SensorThings API is the OGC standard endorsed as an INSPIRE good practice for sharing observation data. It is compliant with the ISO 19156/OGC Observations & Measurements standard. Entry-level descriptions of the API are available on Wikipedia.
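To give a feel for the OData-style access that STA offers, here is a minimal sketch that builds a query URL for the observations of one datastream within a time window. The base URL and datastream id are hypothetical; `$filter`, `$orderby` and `$top` are standard STA query options.

```python
from urllib.parse import quote


def observations_url(base_url: str, datastream_id: int,
                     start: str, end: str, top: int = 100) -> str:
    """Build an STA URL listing Observations of a Datastream in [start, end)."""
    filter_expr = f"phenomenonTime ge {start} and phenomenonTime lt {end}"
    query = "&".join([
        "$filter=" + quote(filter_expr),           # percent-encode spaces and colons
        "$orderby=" + quote("phenomenonTime asc"),
        f"$top={top}",                             # page size
    ])
    return f"{base_url}/Datastreams({datastream_id})/Observations?{query}"


# Hypothetical service root and datastream id.
url = observations_url("https://example.org/sta/v1.1", 7,
                       "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z", top=10)
print(url)
```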

Use cases

STA is a flexible standard, useful in particular for:

Implementation steps

Coverage Data Access APIs Technical description

The base EDR is built on top of the OpenAPI specification and OGC API practice to support hierarchical, filterable and queryable discovery. The default encoding of the OGC APIs is JSON for machine-to-machine (M2M) use, with HTML for human-machine interface (HMI) support. EDR supports binary data in NetCDF as well. The EDR API reference documentation is:

Additional learning materials

In addition, OGC API Records is proposed in Iliad as the metadata repository for datasets and data, and the SensorThings API for sensor data and measurements.

Iliad APIs contain:

Potential further steps

Main functions

Like other OGC APIs, EDR provides an entry-point landing page with:

A description of the endpoints is provided in the API overview.

The main resources exposed in the Iliad EDR are:

The EDR API supports filtering collections by bounding box and time extent. For more advanced queries, OGC API Records is recommended. As an extension of OGC API Features, it can support filtering on multiple properties, including free-text search and CQL queries.
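The collection filtering described above could look like the following sketch, which builds a `/collections` request using the standard `bbox` and `datetime` query parameters. The base URL and the extent values are placeholders.

```python
from urllib.parse import urlencode


def collections_url(base_url: str, bbox: tuple, start: str, end: str) -> str:
    """Build a /collections URL filtered by bounding box and time interval."""
    params = {
        # bbox order: min longitude, min latitude, max longitude, max latitude
        "bbox": ",".join(str(v) for v in bbox),
        # closed time interval in ISO 8601, separated by a slash
        "datetime": f"{start}/{end}",
    }
    # Keep commas, slashes and colons readable instead of percent-encoding them.
    return f"{base_url}/collections?{urlencode(params, safe=',/:')}"


# Hypothetical EDR service root.
url = collections_url("https://example.org/edr", (-10.0, 35.0, 5.0, 45.0),
                      "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z")
print(url)
```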

EDR API supports querying data in [multiple ways](https://docs.ogc.org/is/19-086r5/19-086r5.html#toc44)

Multiple APIs

If multiple endpoints need to be combined, a reasonable approach is to organise them in a self-describing hierarchy. In this case the root-level landing page provides all the underlying links and conformance classes, while each of the referenced sub-APIs has detailed information about the resources it exposes.

─/ <landing page>
 ├─ conformance <conformance classes listing Features, Coverages/EDR and high-level STA conformance classes>
 ├─ edr <combined OpenAPI definition for everything for data access>
 ├─ collections <OGC-API Records, with use-case specific views on the data>
 ├─ sta/v1.1 <STA interface, with detailed STA conformance classes on the landing page>
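A root landing page for the hierarchy above might be assembled as in the following sketch. The link relations `self`, `conformance`, `service-desc` and `data` are commonly used by OGC APIs; the titles, the base URL and the use of `related` for the STA endpoint are assumptions made for illustration.

```python
def landing_page(base_url: str) -> dict:
    """Build a landing page document linking to each sub-API in the hierarchy."""

    def link(rel: str, path: str, title: str,
             media_type: str = "application/json") -> dict:
        href = f"{base_url}/{path}" if path else base_url
        return {"rel": rel, "href": href, "type": media_type, "title": title}

    return {
        "title": "Data access APIs",  # illustrative title
        "links": [
            link("self", "", "This landing page"),
            link("conformance", "conformance", "Conformance classes"),
            link("service-desc", "edr", "Combined OpenAPI definition",
                 "application/vnd.oai.openapi+json;version=3.0"),
            link("data", "collections", "Collections"),
            # Assumed relation type for the SensorThings API entry point.
            link("related", "sta/v1.1", "SensorThings API"),
        ],
    }


page = landing_page("https://example.org/api")  # hypothetical root URL
for entry in page["links"]:
    print(entry["rel"], entry["href"])
```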

Example APIs suite

Relation to the other standards, APIs and data formats

ISO TC211 19* standards

OGC standards are well aligned with the ISO TC211 standards suite and largely implement standards such as the 19115 metadata model. In some cases, OGC standards are endorsed as ISO standards (OGC Features API), and sometimes they are jointly developed (Observations & Measurements, used by the STA data model).

DCAT

The Iliad metadata profiles documented in the Iliad Building Blocks register refer to the DCAT ontology, either through OIM definitions or directly. DCAT is part of the canonical information model for dataset description.

Zarr

Zarr storage is gaining popularity for environmental data. It is inspired by the NetCDF and HDF formats, providing access to multidimensional data and adding a chunking mechanism that enables access to selected chunks through HTTP range requests. Like the legacy formats, it is not limited to geospatial information, while the profiles are expected to follow the Climate and Forecast conventions and an approach similar to the NCEI templates where relevant.
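The chunking mechanism can be sketched without any Zarr library: given the chunk shape of an array, the chunk that holds a given element follows from integer division of each index. The sketch assumes the Zarr v2 convention of joining chunk indices with a dot; the array layout and numbers are illustrative.

```python
def chunk_key(index: tuple, chunk_shape: tuple, separator: str = ".") -> str:
    """Return the key of the chunk that contains the element at `index`."""
    if len(index) != len(chunk_shape):
        raise ValueError("index and chunk_shape must have the same rank")
    return separator.join(str(i // c) for i, c in zip(index, chunk_shape))


def offset_in_chunk(index: tuple, chunk_shape: tuple) -> tuple:
    """Return the position of the element inside its chunk."""
    return tuple(i % c for i, c in zip(index, chunk_shape))


# Illustrative (time, lat, lon) array stored in chunks of 24 x 100 x 100.
chunks = (24, 100, 100)
element = (130, 45, 700)
print(chunk_key(element, chunks))        # 5.0.7
print(offset_in_chunk(element, chunks))  # (10, 45, 0)
```

A client therefore only needs the array metadata to know which chunk object to request, which is what makes access without an application-level server possible.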

Integration with the API includes:

SeaDataNet

TBD, but SeaDataNet is a combination of an ISO 19115 profile for metadata, CF and its own encodings for tabular data, so the overlap is in the above-mentioned standards.

SpatioTemporal Asset Catalog (STAC), Catalogue Service for the Web (CSW) and Records API

The recently published OGC API Records is a superset of STAC (STAC is a de-facto profile of Records), so it became of interest to Iliad as potentially covering all the metadata.

The Iliad Service layer can generate STAC files. They are not going to be directly integrated in the APIs, although some metadata can be passed to the data catalog.
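For orientation, a minimal STAC Item can be written as a GeoJSON Feature with a few extra fields, as in the sketch below. This is not the Iliad Service layer's actual output; the item id, extent and asset URL are hypothetical.

```python
import json


def stac_item(item_id: str, bbox: tuple, datetime_utc: str, asset_href: str) -> dict:
    """Build a minimal STAC Item whose footprint is the given bounding box."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last positions are identical.
            "coordinates": [[[west, south], [east, south], [east, north],
                             [west, north], [west, south]]],
        },
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {"data": {"href": asset_href, "roles": ["data"]}},
    }


# Hypothetical dataset description.
item = stac_item("model-run-2024-01-01", (-10.0, 35.0, 5.0, 45.0),
                 "2024-01-01T00:00:00Z", "https://example.org/data/model.zarr")
print(json.dumps(item, indent=2))
```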

CF convention and NCEI NetCDF

The CF convention defines profiles of the NetCDF format, guiding how to use it for geospatial data (based on climate and forecast use cases). For example, it defines how to use dimensions, variables and attributes for geospatial data and what the relation between them is (e.g. interpolation of a measurement in a variable and metadata in attributes).

NetCDF NCEI templates are NetCDF profiles aligned with the CF convention for a number of use cases such as time series, gridded data, trajectories etc. https://www.ncei.noaa.gov/netcdf-templates

Both the CF conventions and the NCEI templates are used in the Iliad pilots to a limited extent. Iliad proposes new validation mechanisms for the native version of the CF representation in the Zarr format. Several implementations of CF validators for NetCDF and Zarr data are maintained, but:
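To make the role of attributes concrete, the sketch below runs a few simplified CF-style checks over a dataset described as plain dictionaries. It is a toy illustration, not one of the validators used in Iliad: the rules are deliberately stricter and far less complete than the CF conventions, and the dataset layout is an assumption.

```python
def check_cf(dataset: dict) -> list:
    """Return a list of problems found by a few simplified CF-style checks."""
    problems = []
    conventions = str(dataset.get("attributes", {}).get("Conventions", ""))
    if not conventions.startswith("CF-"):
        problems.append("global attribute 'Conventions' should start with 'CF-'")
    for name, variable in dataset.get("variables", {}).items():
        attrs = variable.get("attributes", {})
        # Simplification: CF only requires units for dimensional quantities.
        if "units" not in attrs:
            problems.append(f"{name}: missing 'units'")
        # Simplification: CF recommends, but does not require, one of these.
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(f"{name}: missing 'standard_name' or 'long_name'")
    return problems


# Illustrative dataset description; variable names and values are made up.
dataset = {
    "attributes": {"Conventions": "CF-1.8"},
    "variables": {
        "lat": {"dimensions": ["lat"],
                "attributes": {"units": "degrees_north", "standard_name": "latitude"}},
        "sst": {"dimensions": ["time", "lat", "lon"],
                "attributes": {"long_name": "sea surface temperature"}},
    },
}
print(check_cf(dataset))
```

Here only `sst` is reported, because it lacks a `units` attribute.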

*DAP services

DAP services are used in several Iliad pilots through open implementations (THREDDS/ERDDAP). The same implementations are used by CMEMS, EMODNet and NOAA. They already expose selected OGC Web Services (WCS, WMS). Beyond that, DAP provides a specific API for NetCDF-like data, both request-response and streaming, which is used out of the box in Iliad and is not part of the considerations in this task. DAP services use the NetCDF DDS for discovery of metadata and of the variables in the domain, which shall be aligned with the information model. Usually this is done through CF alignment.

Beyond vanilla DAP, ERDDAP proposes its own API, which is a combination of a catalog (like Records and OpenSearch) and trimming and rescaling (like WCS/Coverages/EDR). This is an area of potential alignment of the implementations, which could be done after the publication of OGC API Coverages.

Acknowledgements

The work has been co-funded by the European Union, Switzerland and the United Kingdom under the Horizon Europe: