opengeospatial / ogcapi-processes

https://ogcapi.ogc.org/processes

Other process encodings? #325

Open m-mohr opened 1 year ago

m-mohr commented 1 year ago

In Part 1 I found the following sentence:

The Core does not mandate the use of any specific process description to specify the interface of a process. Instead this standard defines and recommends the use of the following conformance class: OGC Process Description

This means I'd expect that I could for example use the openEO process encoding in OAP.

Requirement 11 in "Core" says:

The content of that response SHALL be based upon the OpenAPI 3.0 schema processList.yaml.

The processList.yaml refers to the processSummary.yaml though which has a very specific encoding in mind: http://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/processSummary.yaml

Thus, I think there's a conflict in the specification that should be resolved. Are we allowed to return e.g. openEO processes in /processes?

fmigneault commented 9 months ago

Because CWL (and openEO) do not have a notion of an OGC API collection as a first class object, I don't think it would be possible to directly map a workflow making use of them to either.

I don't see why that would not be possible. Using an OGC API client that knows how to interact with the concept of an OGC API collection is no different from running any other script. Whatever code runs to resolve the OGC Workflow, even with late binding, could convert it dynamically to a CWL/openEO workflow, fill in any necessary conversion process between steps, and then run it. I usually prefer static workflows with well-established steps and connections, so the user knows exactly what they are running, but dynamic resolution could still be supported.

If I enforce GeoTIFF and EDR at a particular hop, and either the client or the server does not support GeoTIFF, the workflow validation will fail. But if I leave that open, maybe they will find out that they both support JPEG-XL and OGC API - Tiles and can interoperate that way.

Why not set them to use JPEG-XL directly? If they both support it, they should both advertise it, and it would be possible to align them this way right from the start.

Maybe for very specific formats, allowing some flexible format matching could make sense, but I can see a lot of cases where that could have the opposite effect. For example, if the servers figure out that they both support application/json, that has a high chance of failing. Even with more specific formats such as GeoJSON, if the second step expects a GeometryCollection but the first process returns a Point, the workflow will fail even though everything "worked" according to schemas and formats. Relying on formats and schemas helps, but does not guarantee that the contents work together. When workflows are defined explicitly (e.g. the "static" vs "dynamic" I referred to earlier), users have to make conscious choices when building them, which leads to fewer unexpected errors.
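To make the GeoJSON case concrete, here is a minimal sketch of two steps that agree on the media type and still fail to interoperate. The function names and the sample document are hypothetical and are not taken from any implementation discussed in this thread.

```python
GEOJSON = "application/geo+json"


def formats_match(produced: str, accepted: list[str]) -> bool:
    """Format-level check: is the produced media type one the next step accepts?"""
    return produced in accepted


def count_geometries(doc: dict) -> int:
    """Hypothetical second step that only understands a GeometryCollection."""
    if doc.get("type") != "GeometryCollection":
        raise ValueError(f"expected GeometryCollection, got {doc.get('type')!r}")
    return len(doc["geometries"])


# Hypothetical output of the first step: valid GeoJSON, but a single Point.
step1_output = {"type": "Point", "coordinates": [-75.7, 45.4]}

# The negotiation succeeds because both steps advertise GeoJSON.
negotiation_ok = formats_match(GEOJSON, [GEOJSON, "application/json"])

# The actual hand-off still fails because the content is not what step 2 needs.
try:
    count_geometries(step1_output)
    step2_ok = True
except ValueError:
    step2_ok = False
```

In this sketch the media type check passes while the second step rejects the content, which is the failure mode an explicitly defined workflow is more likely to catch at design time.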

m-mohr commented 6 months ago

Because CWL (and openEO) do not have a notion of an OGC API collection as a first class object, I don't think it would be possible to directly map a workflow making use of them to either.

openEO has a load_collection process and recently also got an export_collection process. This is "first class" in openEO :-)

But I feel like we are departing from the original issue. Was there a final conclusion regarding the other process encodings?

Even if I'd try to convert openEO processes into a process summary encoding, the required version number would be missing. I'm also not quite sure yet what the metadata property is used for.

jerstlouis commented 6 months ago

Was there a final conclusion regarding the other process encodings?

We definitely want to allow for alternative encodings of processes; an OpenAPI encoding is one that I would be curious to experiment with.

If it would be useful to have the same resource also available in a different encoding specific to openEO, that would require a different media type to negotiate it. Even if the openEO community standard ends up using /processes and/or /jobs differently than specified in Processes - Part 1, the idea for the proposed Part 3 - openEO Process Graph Workflows requirement class is that it would be possible to provide a process description using the OGC JSON requirement class of Part 1. As discussed above, I think that should definitely be possible: technically, Processes - Part 1 is so generic that it can describe any kind of function that takes an input and generates an output.

the required version number would be missing.

You mean the version required by processSummary?

If that information is not available, that could just default to 1.0?
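As a rough illustration of that suggestion, the sketch below maps an openEO-style process description onto a processSummary-like dictionary and falls back to "1.0" when no version is present. The field mapping (summary to title, categories to keywords) and the function name are assumptions made for this example; neither specification defines such a conversion.

```python
def openeo_to_process_summary(process: dict, default_version: str = "1.0") -> dict:
    """Map an openEO-style process description to a processSummary-like dict.

    The field mapping is an assumption made for illustration; it is not
    defined by OGC API - Processes or by openEO.
    """
    summary = {
        "id": process["id"],
        # openEO process descriptions carry no version, so fall back to a default.
        "version": process.get("version", default_version),
    }
    if "summary" in process:
        summary["title"] = process["summary"]
    if "description" in process:
        summary["description"] = process["description"]
    if process.get("categories"):
        summary["keywords"] = list(process["categories"])
    return summary


# Abbreviated, hypothetical openEO-style process description.
load_collection = {
    "id": "load_collection",
    "summary": "Load a collection",
    "description": "Loads a collection from the current back-end by its id.",
    "categories": ["cubes", "import"],
    "parameters": [],
    "returns": {"schema": {"type": "object", "subtype": "datacube"}},
}

result = openeo_to_process_summary(load_collection)
```

The sketch deliberately drops parameters and returns, since a summary does not carry inputs and outputs; a full process description would need a separate mapping for those.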

openEO has a load_collection process and recently also got an export_collection process. This is "first class" in openEO :-) But I feel like we are departing from the original issue.

Probably departing indeed from the original issue and we should move this to a new Collection Input / Output discussion issue but:

What I mean by "first class" is that there is a "collection" concept which is an alternative to the "by value", "by reference" (href) and nested process mechanisms for accepting spatiotemporal data as an input to a process, or for generating it on demand as an output of a process, for a given time, area and resolution of interest. It does not involve the use of an additional process (as far as the Processes API / execution request goes). Collection Output does not involve creating anything external anywhere, and different executions for different areas, times or resolutions of interest do not result in separate collections.

Though they are designed to effortlessly chain with each other, and could internally be implemented as processes, Collection Input and Collection Output are quite different beasts in terms of how they are defined:

@m-mohr I am curious to what extent your load_collection is equivalent to WHU's loadCube process, as discussed in https://gitlab.ogc.org/ogc/T19-GDC/-/issues/57 (and @fmigneault, similar question for your similar process).

In their case, supporting Collection Input (which is about local collections; Remote Collections is the equivalent for OGC API collections from external APIs, which requires the server to act as a client) would be as simple as internally converting:

   "data": {
      "collection": "http://oge.whu.edu.cn/geocube/gdc_api_t19/collections/SENTINEL-2%20Level-2A%20MSI"
   }

to:

   "data": {
      "process": "http://oge.whu.edu.cn/geocube/gdc_api_t19/processes/loadCube",
      "inputs": { "cubeName": "SENTINEL-2 Level-2A MSI" }
   }
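The internal conversion shown above could be sketched as follows. This assumes, for illustration only, that a local collection lives at <API root>/collections/<id>, that the loader process lives at <API root>/processes/loadCube, and that it takes the decoded collection identifier as its "cubeName" input; the function name is hypothetical.

```python
from urllib.parse import unquote


def collection_to_nested_process(value: dict, process_name: str = "loadCube") -> dict:
    """Rewrite {"collection": url} as a nested call to a loader process.

    Assumes (for illustration only) that the collection lives at
    <api root>/collections/<id> and that the loader process lives at
    <api root>/processes/<process_name> and takes a "cubeName" input.
    """
    url = value["collection"]
    api_root, sep, collection_id = url.rpartition("/collections/")
    if not sep or not collection_id:
        raise ValueError(f"not a collection URL: {url!r}")
    return {
        "process": f"{api_root}/processes/{process_name}",
        # Percent-decode the identifier, e.g. %20 becomes a space.
        "inputs": {"cubeName": unquote(collection_id)},
    }


data = {
    "collection": "http://oge.whu.edu.cn/geocube/gdc_api_t19/collections/SENTINEL-2%20Level-2A%20MSI"
}
nested = collection_to_nested_process(data)
```

Applied to the "data" input above, this yields the nested-process form shown in the second snippet.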

The important aspect is that loadCube in this context does not mean requesting the entire collection: only the spatiotemporal subset / resolution / fields needed to satisfy the current requests (which may be coming in to the server as Collection Output client requests) need to be retrieved. But the initial handshake with the remote server can be established when the execution request is initially submitted (which for Collection Output is only when the client first POSTs the execution request, not for every coverage tile or subset requested later on).

The Collection Input req. class accomplishes a few things:

pvretano commented 3 months ago

SWG Meeting: 13-MAY-2024: There was some discussion in the SWG today about using OpenAPI as the process description language. Basically, you do a GET /processes/{processID} and you negotiate an application/vnd.oai.openapi+json;version=3.0 response. What you get back is a small OpenAPI document with the description of that one process.
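From the client side, the negotiation described above might look like the sketch below. It only builds the request and does not send it; the API root https://example.org/ogcapi and the process identifier "buffer" are placeholders, and the helper name is hypothetical.

```python
from urllib.request import Request

# Media type mentioned in the SWG notes for an OpenAPI 3.0 process description.
OPENAPI_JSON = "application/vnd.oai.openapi+json;version=3.0"


def build_description_request(
    api_root: str, process_id: str, media_type: str = OPENAPI_JSON
) -> Request:
    """Build (but do not send) a GET request for one process description."""
    url = f"{api_root.rstrip('/')}/processes/{process_id}"
    return Request(url, headers={"Accept": media_type}, method="GET")


# https://example.org/ogcapi and "buffer" are placeholders.
req = build_description_request("https://example.org/ogcapi/", "buffer")
```

Passing the request to urllib.request.urlopen would perform the call; whether a given server honours this media type depends on the server and is not something the sketch can guarantee.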

m-mohr commented 3 months ago

just fyi: That only makes sense if you expose processes as HTTP endpoints, which is not the case for openEO. And that was the initial question: can we have a conformance class that allows us to send openEO process descriptions via the GET /processes endpoint?

jerstlouis commented 3 months ago

@m-mohr A bit confused by your last comment...

Aren't there process description HTTP end-points in openEO?

I understand that there's no individual process execution end-points in openEO. But I thought this issue was about the description, for which I thought there are individual process description end-points for openEO processes? Thanks.

fmigneault commented 3 months ago

Both of these are requirements :

Allowing alternate negotiation formats for the process description/execution makes sense, but the APIs should at least provide the minimal endpoint requirements to allow these negotiations to take place.

m-mohr commented 3 months ago

We only have a single GET /processes endpoint, which describes all processes according to the process definition language; there's no GET /processes/:id yet.

bpross-52n commented 2 months ago

SWG meeting from 2024-05-27: Add information about how we intend alternative process encodings to interact with the /processes and /processes/{id} endpoints. This will be based on content negotiation. Wherever we describe a process description with regard to API Processes, include the media type. Expand the content of section 7.10, and include an example of an OGC process description and an example of an OpenAPI description.

m-mohr commented 2 months ago

Please keep in mind that, for example, both OGC API Processes and openEO use application/json as the media type, so content negotiation might be difficult.

pvretano commented 2 months ago

@m-mohr the discussion was that we would define media types that were not the generic ones. So to get an OGC process list from /processes you would negotiate to something like application/ogc-proc-list+json ... or something like that. Anyway, I'll create the PR and everyone can review and chime in.

fmigneault commented 2 months ago

@pvretano Could application/json; profile=urn:ogc:... and application/json; profile=urn:openeo:... be considered instead, similar to what https://www.w3.org/TR/dx-prof-conneg/#motivation proposes?

Using something like application/ogc-proc-list+json limits the use of other media-types, such as application/ld+json that could very well be used to represent an extended form of OGC/OpenEO processes.
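A server-side sketch of this kind of profile-based selection is shown below. The profile URNs are placeholders (the ones above are elided), the Accept parsing is deliberately simplified, and treating plain application/json as the OGC encoding is an assumption of this example rather than anything agreed in the thread.

```python
from typing import Optional


def parse_accept(header: str) -> list[tuple[str, dict]]:
    """Split an Accept header into (media type, parameters) pairs.

    Simplified: ignores q-value ordering and quoted commas or semicolons.
    """
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        params = {}
        for part in parts[1:]:
            name, _, value = part.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        entries.append((parts[0].lower(), params))
    return entries


# Hypothetical profile URNs used only for this example.
ENCODINGS = {
    ("application/json", "urn:example:ogc-process-description"): "ogc",
    ("application/json", "urn:example:openeo-process"): "openeo",
}
DEFAULT_ENCODING = "ogc"  # assumption: plain application/json means the OGC encoding


def select_encoding(accept: str) -> Optional[str]:
    """Return the first encoding that satisfies the Accept header, else None."""
    for media_type, params in parse_accept(accept):
        profile = params.get("profile")
        if profile is None and media_type in ("application/json", "*/*"):
            return DEFAULT_ENCODING
        if (media_type, profile) in ENCODINGS:
            return ENCODINGS[(media_type, profile)]
    return None
```

Because the table is keyed on the pair of media type and profile, other base types such as application/ld+json could be added with the same profiles without minting a new media type for each encoding.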

pvretano commented 2 months ago

@fmigneault I said "something like". I am not proposing application/ogc-proc-list+json be the mediatype. I was just using that as an example.

One question, is "profile" a valid parameter for the application/json media type? I don't remember reading that in the application/json media type.

fmigneault commented 2 months ago

Of course :) Just proposing ahead to consider this use case while I had it in mind.

The application/json RFC indicates that it has no optional parameters and no additional parameters defined. However, the Content-Type header RFC seems to indicate that subtypes may specify additional parameters for their own use. Therefore, unless specific uses are reserved by the type, other parameters are somewhat open (my interpretation at least). It also indicates that parameters are modifiers of the media subtype, but that they do not fundamentally affect the nature of the content.

For what it's worth, I have seen browsers respect filename*="xyz.json" used to indicate the recommended name when downloading the JSON content as a file, so I do not think there is a limitation on parameters that are not explicitly listed. The profile name seems fairly standard for other structured formats such as XML, so I think it would be reasonable to use it in the case of JSON.

I've also seen Accept-Schema, Accept-Profile and Prefer: schema=x in some cases, so I don't think there is a unified way regardless of the choice.

jerstlouis commented 2 months ago

@pvretano The media type ( https://www.iana.org/assignments/media-types/application/json ) does not define any parameters

I do believe we should stick to application/json for the OGC Process description for compatibility with 1.0.

But I think what @fmigneault is pointing out is that the way negotiation by profile works is that you can always add that profile= to any media type, if I understand correctly? This is the related common issue: https://github.com/opengeospatial/ogcapi-common/issues/8