opengeospatial / ogcapi-routes

public repo for OGC API - Routes Standards Working Group
Other
10 stars 3 forks source link

/routes/{routeId}; sync vs. async and a Common /jobs #37

Open jerstlouis opened 2 years ago

jerstlouis commented 2 years ago

Reading through the latest specification and its conformance classes and dependencies, my understanding is that the main purpose of the GET /routes, GET /routes/{routeId} and DELETE /routes/{routeId} is to manage asynchronous execution. However, it also seems quite tangled up with the ability to preserve, share and manage computed routes, i.e. those capabilities are only described in conformance classes depending on the async conformance class, even though some text mentions in passing that they could be provided for sync as well.

I really think we should consider using /jobs and /jobs/{jobId} proposed as a future Common extension by Processes (also see https://github.com/opengeospatial/ogcapi-processes/issues/59 and https://github.com/opengeospatial/ogcapi-coverages/issues/66), because that is exactly what it is intended for. The schema for the job status is here: statusInfo.yaml. In this case we would define a new type of job which would contain "type" : "route". We could re-use many of the "API Requirements Modules" defined in there (and make progress towards making this a Common building block with 2 specifications using it). e.g. all of the following would apply:

The relevant sections and requirements are:

We would not re-use anything to do with /jobs/{jobId}/results however as that is specific to the processing outputs (unless we take Processes harmonization even further, see below, that would actually make a lot of sense). Instead, once the job completes there would simply be a link to the completed route (which may be at /routes/{routeId}, if that end-point is supported -- see below).

EDITED: I had somehow missed Requirement 6 B (I think I was searching through the specification for routeId -- could be easier to discover with something like: "after sub-resource of /routes, i.e. /routes/{routeId}).

If all this seems to complicated (or the Commonization would take too long), perhaps it could be excluded from Core and defined as a later extension? From performance discussions so far, it seems that we all hope for routing requests to complete in a matter of a few seconds anyways -- is async capability really needed in Core?

Another possibility is that we could potentially define separate relation types for listing routes (see below) than for calculating routes. This might allow e.g. to link to a routing process at /processes/RoutingProcess/execution, which already supports async requests using the Processes in almost the same async way. With these separate relation types, just as it does right now (if I understand correctly), the Routes: Core conformance class would still only require these two things:

The GET /routes, GET /routes/{routeId}, DELETE /routes/{routeId}, GET /routes/{routeId}/definition could still exist for the purpose of actually preserving and sharing routes, but instead be defined in a separate conformance class (e.g. "Routes Management") which would be independent of the way the routes were computed (synchronously, asynchronously or using the Processes approach). This would allow a client to know whether this functionality is available or not when the API only supports synchronous execution -- right now that is not possible.

Regardless of whether we go with this suggestion, the current text about the Prefer: header currently requires clients to be ready to handle both async and sync responses if conformance to async is declared. I see this as problematic as working interoperability between a client and server can suddenly be broken if a server then adds support for async and start returning async responses even though a client still expects sync. More importantly, handling async responses for clients is significantly more complicated than sync, so it would be preferable to go with the wording (or at least the logic) from Processes: a server can only return an async response if a Prefer: header was provided by the client and it declares conformance to async. This is handled in _Requirement 25_ which specifies conditions:

Conditions: The execute request is not accompanied with the HTTP Prefer header. C) The server SHALL respond synchronously if, according to the job control options in the process description [the conformance declaration instead for Routes], the process can be executed in either mode [sync would be required by Routes - Core].

@cportele @skyNacho

cportele commented 2 years ago

@jerstlouis

I disagree.

The scope of OGC API Routes are resources for routes. The current resources Routes (/routes) and Route (/routes/{id}) with the associated operations GET/POST and GET/DELETE respectively are perfectly inline with how one creates, fetches and removes route resources using HTTP. A persisted Route resource has different states while it is being computed. There is nothing "tangled up" as far as I can see.

No other resources like jobs are needed or useful. This may be how you implement the capability to compute routes, but that is an implementation consideration. From a resource point of view, jobs are not relevant for computing or managing routes.

This issue feels like reopening #15, I thought we had settled this.

jerstlouis commented 2 years ago

@cportele

This may be how you implement the capability to compute routes, but that is an implementation consideration.

It's not, we have not yet implemented async capabilities for either Processes or Routes. But as we would like to eventually implement both, and right now we want to implement job listing & management from the perspective of a synchronous client and synchronous routing API, this is me observing that:

This issue feels like reopening #15, I thought we had settled this.

The context is different: before we were discussing the ability to make (mostly synchronous) routing requests in similar way for either a Processes-based or Routing-based API. Now the context is trying to implement routes listing, retrieval and management independently of how the route was requested (e.g. sync or async).

This would also solve the problem that right now the only way to poll the status of a route involves a GET to /routes listing all available routes -- there is no way to poll the status of the one route that was created?

I realize that I was mistaken there, sorry. I somehow missed Requirement 6 B that says:

The response SHALL include a header Location with the URI of the new route that is a sub-resource of /routes.

To summarize, I had really packed multiple issues into this one, so maybe it would be easier to unpack them into separate ones. Would you agree with these two most important points and hopefully less contentious aspects of this issue:

The other more contentious aspects being any of:

The latter were more me thinking out loud, and I admit that they might reopen settled things from #15 etc., so it would only make sense to consider them if we are all in strong agreement on any of them.

cportele commented 2 years ago

@jerstlouis

Creating a separate "Routes Listing & Retrieval" conformance class defining the GET operation at /routes, /routes/{routeId}, and /routes/{routeId}/definition so that it can be relied upon by clients regardless of whether servers support the ASync conformance class or not

Yes, that makes sense to me and I would support it.

Changing text in 7.3 that says "Clients requesting routes from servers that implement this requirements class have to be prepared to receive routes asynchronously. " to include "if the client supplies a Prefer: header". This would ensure that clients supporting Core only can work with any implementation. It also aligns with the Processes 1.0 specification and what I think we had agreed made sense when we discussed this together with Peter.

In practice, I do not see much of a difference. In the current draft, if the sync-only client receives a 202, it will stop with the request. In your proposed change, if the server decides that it will only do async, it will send a 400 and the client will stop with the request. Different status code, same result. I think we should check, if we can find more evidence of one approach or another in current practices.

The other more contentious aspects being any of: ...

As discussed many times, jobs/processes as resources are not a natural abstraction for a Routing API for most users.

I agree that synchronous processing is more important and it could be an option to move the asynchronous class to a future extension.

jerstlouis commented 2 years ago

@cportele

Yes, that makes sense to me and I would support it.

Cool, thank you!

In practice, I do not see much of a difference. In the current draft, if the sync-only client receives a 202, it will stop with the request. In your proposed change, if the server decides that it will only do async, it will send a 400 and the client will stop with the request. Different status code, same result. I think we should check, if we can find more evidence of one approach or another in current practices.

Well the proposal was that servers would be required to support synchronous by the Core conformance class, as most readers would expect until they read the ASync conformance class and then become confused (many probably would not even bother to read it as they just intend to send synchronous requests).

While one could argue for servers supporting only async for EO heavy batch processing, it seems reasonable here to require sync support for the few seconds it takes to compute a route in order to keep things as extra simple and interoperable as possible for clients.

I agree that synchronous processing is more important and it could be an option to move the asynchronous class to a future extension.

That would likely settle the above question as well, and hopefully move in the direction that supporting an extension doesn't break the interoperability with clients implementing only Core. Note that the current async conformance class made more sense to me after I discovered Req 6 B, so I am not totally opposed to it being in Part 1. The permission to return async when the client did not indicate async-awareness, and the separation of job listing/retrieval, is what bothered me the most.

cportele commented 2 years ago

@jerstlouis

When I wrote "the server decides that it will only do async", this was in the context of a specific request. All OGC API Routes implementations must process requests synchronously, if they only implement Core, but that does not preclude them to throw an error, if there is a request that they reject, e.g., if the server decides that it would take too long. Based on the upcoming HTTP revision, a 422 response would probably be appropriate (not the 400 I mentioned in my previous comment).

cportele commented 2 years ago

With respect to Core 1.0, all aspects should now be covered by other issues, in particular #40.

cportele commented 2 years ago

Meeting 2021-12-09: We discussed as the remaining point whether the POST has to be sent to /routes or if that could be made flexible and the resource referenced using a link relation type. It is important to keep the Core simple. If such a capability would be supported, it would have to be in a future extension.