opengeospatial / ogcapi-discrete-global-grid-systems

https://ogcapi.ogc.org/dggs

DGGS Processes - As another Conformance Class... #54

Closed geofizzydrink closed 1 week ago

geofizzydrink commented 2 years ago

Hi All,

I've been working through aligning the TerraNexus OGC API Server with the latest draft of the OGC API DGGS definition document, and I've hit an issue regarding the best way to structure DGGS Processing API endpoints. While the current OGC API DGGS definition document does a really nice job of defining the Zone Query and Data Retrieval operations one would want from a DGGS, one set of operations is still missing - namely "how do we get data into a DGGS via OGC APIs".

One way is to leverage OGC API Processes, where the DGGS API implementer defines and publishes a number of OGC API Processes on their API server that perform background DGGS operations (which may be specific to both the DGGS in question and the infrastructure in which it is deployed). One example is a function to map data to DGGS Zones: there are many ways in which this can be done, as some of our discussions within the group on this topic have demonstrated. So rather than attempt to force one specific schema for this (and to allow other custom processes to be applied), it seems logical to me that we leverage OGC API Processes here.

My question/uncertainty here is "what is the best practice to integrate OGC API Processes into OGC API DGGS?"

I see three possibilities (all perfectly implementable given the flexibility of DGGS infrastructures):

  1. No New Conformance Class
    • We provide a Best Practice Guide advising that any and all internal DGGS functions can be defined and used via a stand-alone implementation of OGC API Processes.
    • This is perhaps the simplest route from our standards-drafting perspective; however, a DGGS API developer who picks up the OGC API DGGS standard (and definition document) may find it ambiguous how they should implement data ingestion into their server via OGC APIs.
  2. New DGGS Processes Conformance Class - with Processes endpoints at the root

    • This would look something like:

        /processes
        /processes/{processId}
        /processes/{processId}/dggs
        /processes/{processId}/dggs/{dggsRSID}
      • But where should the "Jobs" endpoints go?

        /processes/{processId}/jobs
        /processes/{processId}/jobs/{jobId}
        
        or, 
        
        /processes/{processId}/dggs/{dggsRSID}/jobs
        /processes/{processId}/dggs/{dggsRSID}/jobs/{jobId}
      • To me this appears to break the OGC API Processes pattern.
      • It would also require any DGGS process to include the dggsRSID as an input parameter. While perfectly above board from an OGC API Processes perspective (i.e. the same as including an input parameter for a collectionId), it could lead to ambiguities in managing job statuses across multiple DGGS instances.
  3. New DGGS Processes Conformance Class - with DGGS endpoints as the root

    • This would look something like:

        /dggs/{dggsRSID}/processes
        /dggs/{dggsRSID}/processes/{processId}
        /dggs/{dggsRSID}/processes/{processId}/jobs
        /dggs/{dggsRSID}/processes/{processId}/jobs/{jobId}
    • This seems much neater to me and seems to fit better thematically with the other DGGS API Conformance Classes. For Example:

      Endpoint                      OGC API DGGS Conformance Class
      /dggs                         DGGS Definition
      /dggs/{dggsRSID}/data         DGGS Data Retrieval ("what is here?")
      /dggs/{dggsRSID}/zones        DGGS Zone Query ("Where is it?")
      /dggs/{dggsRSID}/processes    DGGS Processes
    • It would also provide a way to consistently access and track processes from both the DGGS infrastructure level and the collection level. For example:

      *DGGS Infrastructure Level*
      /dggs/{dggsRSID}/processes/...
      
      *Collection Level*
      /collections/{collectionId}/dggs/{dggsRSID}/processes/...
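
To make the Option 3 layout concrete, here is a minimal Python sketch that expands the path templates above into concrete paths, at either the DGGS infrastructure level or the collection level. The function name and the example identifiers (`ISEA3H`, `map-to-zones`, `job-42`, `sentinel2`) are hypothetical and used only for illustration; none of them come from the draft standard.

```python
from typing import List, Optional


def dggs_process_paths(dggs_rsid: str, process_id: str, job_id: str,
                       collection_id: Optional[str] = None) -> List[str]:
    """Expand the Option 3 path templates (DGGS endpoints as the root).

    If collection_id is given, the same paths are nested under
    /collections/{collectionId}, as in the collection-level example.
    """
    prefix = f"/collections/{collection_id}" if collection_id else ""
    base = f"{prefix}/dggs/{dggs_rsid}/processes"
    return [
        base,                                  # list of processes
        f"{base}/{process_id}",                # one process description
        f"{base}/{process_id}/jobs",           # jobs for that process
        f"{base}/{process_id}/jobs/{job_id}",  # one job's status
    ]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    for path in dggs_process_paths("ISEA3H", "map-to-zones", "job-42"):
        print(path)
    for path in dggs_process_paths("ISEA3H", "map-to-zones", "job-42",
                                   collection_id="sentinel2"):
        print(path)
```

Because the dggsRSID is part of the path in this layout, each job is scoped to a single DGGS instance, which is the property that Option 2 struggles to guarantee.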

Any thoughts/comments/suggestions????

ghobona commented 2 years ago

Testbed-16 explored a similar question of how to build a DGGS API that is based on OGC API - Processes.

Here are some relevant sections from the engineering report.

https://docs.ogc.org/per/20-039r2.html#_aligning_dggs_to_ogc_api_process

https://docs.ogc.org/per/20-039r2.html#process_api

geofizzydrink commented 2 years ago

OK.

Based on a very useful discussion during the DGGS SWG telecon today, it appears that we do not in fact need to create a separate conformance class under OGC API DGGS to cater for driving internal/backend DGGS operations/processes. This is because all of the functionality we need (at least thus far) can already be delivered via a standard implementation of OGC API Processes.

So what we do need to include in the OGC API DGGS spec is a statement and guidance on the use of OGC API Processes and how to link those resources to OGC API DGGS (similar to what the OGC API Tiles spec includes in regard to OGC API Styles).

So I see the API architecture being something similar to this:

  DGGS Process Resources (e.g. mapping/tagging data to the zones of a DGGS)

  DGGS API Resources
    • OGC API DGGS Zone Query Conformance Class
    • OGC API DGGS Data Retrieval Conformance Class
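
As a rough illustration of driving a backend DGGS operation through a plain OGC API Processes deployment, the Python sketch below builds (without sending) an execute request against the `/processes/{processId}/execution` endpoint defined by OGC API - Processes - Part 1: Core. The server URL, the process ID `map-data-to-zones`, and the input names `dggsRSID` and `source` are all assumptions made up for this example; a real server would advertise its own inputs in the process description.

```python
import json
import urllib.request


def build_execute_request(api_root: str, process_id: str,
                          inputs: dict) -> urllib.request.Request:
    """Build (but do not send) an OGC API - Processes execute request."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_root.rstrip('/')}/processes/{process_id}/execution",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical ingestion process: the target DGGS is passed as an
# ordinary process input, so no DGGS-specific endpoint is needed.
request = build_execute_request(
    "https://example.org/ogcapi",
    "map-data-to-zones",
    {"dggsRSID": "ISEA3H", "source": "https://example.org/data/input.tif"},
)

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(request)` is left out on purpose, since `example.org` is only a placeholder host.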

AkexStar commented 1 year ago

@geofizzydrink Hi! I am working on designing APIs for an AA-DGGS (e.g. GeoSOT), but I am a little confused. What is the difference between /dggs/{dggsRSID}/zones/{zoneID}/data and /collections/{collectionID}/dggs/{dggsRSID}/zones/{zoneID}/data? Does the former return a list of all available data that can be indexed by the current {zoneID}, while the latter also returns a list of data, but limited to the types in {collectionID}? I would be very grateful for a reply!

jerstlouis commented 1 year ago

@AkexStar /dggs/{dggsRSID}/zones/{zoneID}/data returns data for the entire dataset (all collections), whereas /collections/{collectionID}/dggs/{dggsRSID}/zones/{zoneID}/data will return data only for one particular collection.

Note that it does not return a "list" of data, but the data itself, in the negotiated media type (e.g., GeoTIFF, GeoJSON, netCDF, or some new specific format).

I imagine the main purpose of the dataset-level /dggs end-point would be to integrate data from multiple datasets, and have all collections available. The client would use this end-point in combination with properties= and/or filter= and CQL2 expressions to specify exactly what should be returned. The server would likely be permitted to refuse to return everything (similar to Permission 5 in OGC API - Tiles). Another option is to still use /collections/{collectionId} but have a joinCollections= parameter, as proposed here.
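
A small Python sketch of how a client might build the two kinds of zone data URL discussed above, with optional properties= and filter= query parameters. The helper name, server URL, zone ID, and property names are hypothetical, and the filter string is only an example of a CQL2 text expression; which parameters a given server accepts is up to that server.

```python
from typing import List, Optional
from urllib.parse import urlencode


def zone_data_url(api_root: str, dggs_rsid: str, zone_id: str,
                  collection_id: Optional[str] = None,
                  properties: Optional[List[str]] = None,
                  cql2_filter: Optional[str] = None) -> str:
    """Build a zone data URL at the dataset level or the collection level."""
    # Collection-level requests are nested under /collections/{collectionId}.
    prefix = f"/collections/{collection_id}" if collection_id else ""
    path = f"{api_root.rstrip('/')}{prefix}/dggs/{dggs_rsid}/zones/{zone_id}/data"
    params = {}
    if properties:
        params["properties"] = ",".join(properties)
    if cql2_filter:
        params["filter"] = cql2_filter
    return f"{path}?{urlencode(params)}" if params else path


if __name__ == "__main__":
    # Dataset level: all collections, narrowed with properties= and filter=.
    print(zone_data_url("https://example.org/ogcapi", "ISEA3H", "A4-0-A",
                        properties=["elevation"],
                        cql2_filter="elevation > 100"))
    # Collection level: data for a single (hypothetical) collection only.
    print(zone_data_url("https://example.org/ogcapi", "ISEA3H", "A4-0-A",
                        collection_id="dem"))
```

The response format would still be chosen through normal content negotiation (for example an Accept header), which this sketch does not cover.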

AkexStar commented 1 year ago

@jerstlouis Thank you! That helps a lot.

jerstlouis commented 1 week ago

As suggested by @geofizzydrink above, the candidate OGC API - DGGS Standard mentions how resources for Zone Data and Zone Query can be attached to a collection created using an OGC API Processes - Part 3 Workflow, using the "Collection Output" requirement class in:

https://docs.ogc.org/DRAFTS/21-038.html#_overview_2

and

https://docs.ogc.org/DRAFTS/21-038.html#_overview_6

Therefore, closing this issue for now.

However, as pointed out by @cnreediii during the OAB review, OGC API - Processes - Part 3 is still draft, and the Processes SWG is also discussing whether "Collection Output" might end up in a separate "Part 5" which would focus on (Virtual) Collection Output, Collection Input, and input/output field modifiers, leaving Part 3 focused on workflow definitions (Remote / Nested Processes). I think the best we can do for now is keep this in mind and update this informative section of the candidate standard or do a corrigendum, depending on the evolution/timing of the relevant OGC API - Processes parts.