opengeospatial / ogcapi-geodatacubes

The Processes vs. openEO elephant in the spec #8

Open jerstlouis opened 5 months ago

jerstlouis commented 5 months ago

From Testbed 18 GDC ER Critical Feedback:

  1. Currently, the user needs to select which way of processing method to follow: the OGC API - Processes or the openEO approach

Ideally the "user" should not have to deal with that -- the client / tools should use whatever they and the server support automatically, without the user being aware of the internal details.

The GDC API as it stands being basically two completely different APIs is the elephant in the room ;)

In my view, the combination of OGC API - Coverages Parts 1 & 2 and OGC API - Processes Parts 1, 2 and 3 can do everything openEO does and more. However, because that combination is so flexible, a specific profile would be needed so that a client can rely on a consistent set of capabilities being implemented (e.g., scaling, subsetting, mathematical functions and operations either as processes and/or as CQL2 expressions).

It should also be possible to write a façade on top of openEO-platform to implement such a profile of those APIs, and implementing openEO as a façade on top of OGC APIs is likely also possible.

It may be possible for some implementations to support both openEO and the OGC API approach at the same end-point, assuming any remaining end-point conflicts are resolved through content negotiation by defining new media types, and that OGC API - Common "Collections" is supported consistently.
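
As a rough illustration of the content negotiation idea, a shared end-point could pick the API flavour from the request's `Content-Type`. This is only a sketch: the media type for openEO process graphs below is hypothetical (no such media type is defined in this thread), and the handler labels are placeholders.

```python
# Hypothetical mapping from request media type to the API flavour handling it.
# "application/vnd.example.openeo-process-graph+json" is an invented placeholder.
HANDLERS = {
    "application/json": "ogc-api-processes",
    "application/vnd.example.openeo-process-graph+json": "openeo",
}


def select_api(content_type):
    """Return the API flavour for a Content-Type header value, or None."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";")[0].strip().lower()
    return HANDLERS.get(media_type)


print(select_api("application/json; charset=utf-8"))
```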

However, the only way to truly end up with a "single API" is if the GeoDataCube picks one approach, at least for the "Core Data Access" and "Core Processing" functionality.

In reality, except for the "User-defined functions" (which would require OGC API - Processes Part 1,2,3), I believe most of what can be done with openEO could be done strictly with OGC API - Coverages - Part 1 & 2 (with joinCollections=, filter=, properties=, CQL2 and Well-Known functions).
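
To make this concrete, here is a minimal sketch of what such a Coverages request might look like. The server URL, collection id, band names and expressions are illustrative assumptions, and the exact parameter syntax should be checked against the current Coverages drafts.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def coverage_url(api_root, collection_id, params):
    """Build a GET URL for a collection's /coverage resource."""
    query = urlencode(params, safe="(),:")
    return f"{api_root}/collections/{collection_id}/coverage?{query}"


url = coverage_url(
    "https://example.org/ogcapi",  # hypothetical server
    "sentinel2-l2a",               # hypothetical collection id
    {
        "subset": "Lat(45:46),Lon(-76:-75)",    # subsetting
        "scale-factor": "4",                    # scaling (downsampling)
        "properties": "(B08-B04)/(B08+B04)",    # derived field (assumed band names)
        "filter": "SCL=4",                      # CQL2-style filter (assumed field)
    },
)
print(url)

# Round-trip check: the server would see the original expressions after decoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["properties"][0])
```

The point of the sketch is that the band arithmetic and the filtering travel as query parameters on a single data access request, with no separate process graph submitted.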

Given that openEO by itself is likely to move ahead as a Community Standard, it seems to me that the Core GDC API should really focus on profiling OGC API - Coverages and OGC API - Processes to provide equivalent (and additional) capabilities.

The ability to use openEO Process Graphs as one possible representation of a workflow, as drafted in Section 13 (openEO Process Graph Workflow) of Processes - Part 3, still feels like the key integration point for workflows that include openEO components.

See also https://github.com/opengeospatial/ogcapi-geodatacubes/issues/1#issuecomment-2081220421 and the Profiles proposal by Ecere in Section 4.1 of the Testbed 19 GDC ER, which describes how a GeoDataCube can be accessed regardless of how the workflow generating it is defined (e.g., openEO process graph, OGC API - Processes execution request, WCPS...).

m-mohr commented 3 months ago

OGC API - Processes has nothing in the specification that makes it specific to data cubes. openEO has. If anything is dropped, OGC API - Processes should be dropped from the GDC API.

jerstlouis commented 3 months ago

@m-mohr

The "Collection Input" / "Collection Output" mechanisms of OGC API - Processes - Part 3: Workflows, combined with data access mechanisms such as OGC API - Coverages, make it possible to take data cubes as inputs, do some processing on these data cubes, and return a data cube as output.
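
A minimal sketch of such an execution request, based on my reading of the Part 3 draft: the input references a collection via a `collection` member, and `response=collection` asks for the result as a collection. The server URLs, the process id `ndvi` and the input name `data` are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_execution(api_root, process_id, input_name, collection_url):
    """Return the (url, body) pair for a Part 3 style execution request."""
    # "Collection Output": request the result as a collection (per the draft).
    query = urlencode({"response": "collection"})
    url = f"{api_root}/processes/{process_id}/execution?{query}"
    # "Collection Input": the input is a reference to an OGC API collection.
    body = {"inputs": {input_name: {"collection": collection_url}}}
    return url, body


url, body = build_execution(
    "https://example.org/ogcapi",                          # hypothetical server
    "ndvi",                                                # hypothetical process id
    "data",                                                # hypothetical input name
    "https://example.org/ogcapi/collections/sentinel2-l2a",
)
print("POST", url)
print(json.dumps(body, indent=2))
```

A client would then access the resulting collection through, e.g., OGC API - Coverages, which is what makes the output a data cube from the client's point of view.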

There is quite a lot of tension between different approaches starting from multiple specifications and trying to come up with a single one, and it will not be easy to come to an agreement.

As per #10 , what I hope we can agree on to move forward (with both Testbed 20 and the GDC API standard) is the following:

If we can agree to the above: