Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0
91 stars 11 forks source link

Generic way for warnings/deprecations on API response #412

Open soxofaan opened 2 years ago

soxofaan commented 2 years ago

Use case: openEO platform aggregator implements /jobs listing the union of batch jobs across all underlying back-ends. At the moment the EODC back-end errors with something like "user ... does not exist and is not whitelisted", causing the whole aggregator response to fail too. The workaround I'm currently working on is to skip the failed response of EODC and just list the jobs from other back-ends (e.g. VITO), which is better than returning nothing. It is however not ideal because user will not see that jobs are possibly missing due to a (temporary) issue at one back-end.

I think it would be good to have a somewhat generic way to indicate that there were (non-fatal) errors/warnings/deprecations/issues for a returned response. For example:

soxofaan commented 2 years ago

This feature would be handy for https://github.com/Open-EO/openeo-aggregator/issues/18 (the generalization of OP's use case)

At the moment I'm most inclined to add a HTTP header to the response:

m-mohr commented 2 years ago

I definitely see the use case. I'm wondering whether HTTP status code 206 ("Partial Content") instead of 200 ("OK") would work here? It sounds like it would fit in and I'd hope that clients would handle it like a 200, but still could make use of it if they want to communicate that this was a partial result. But maybe a header is the better option as 206 is not allowed in the current spec and would be somewhat breaking.

Nevertheless, this seems a bit out of scope for the core API spec and would more belong into an extension that handles federation aspects (which may also include additions such as "openeo:backends" etc.).

soxofaan commented 2 years ago

would more belong into an extension that handles federation aspects

I see it broader than just a "partial-federation-result" thing. Also, the feature only makes sense if client libs support it, and the client side is usually unaware that it is working in a federated context.

m-mohr commented 2 years ago

In a non-federated environment, I don't see much use in a partial response though?! What's the use case there?

The (very simple) federation extension I'm currently writing would add a federation flag (or a list of backends) to capabilities, so clients supporting federation aspects would know that they are connected to a federation.

Still, the 206 status code would either need to go into core, too, or we need to use HTTP headers as right now the core API only allows returning 200. So we may indeed go for a HTTP header, which is better wrt to extensibility.

m-mohr commented 2 years ago

How should such a header be defined? Should it just be a flag or a list of back-ends missing (or contained) in the response?

Examples:

  1. in a federation with eodc, vito, wwu and eurac and wwu and eurac being offline: Federation-Missing-Backend: wwu, eurac?
  2. Federation-Backend-Offline: true
soxofaan commented 2 years ago

In this ticket, I'm aiming for something more generic than just the Open-EO/openeo-aggregator#18 issue. e.g. response headers like

X-OPENEO-WARNING: Collection listing is incomplete because backend "eodc" is down

X-OPENEO-WARNING: The UDP id differs from the process_graph_id URL parameter
m-mohr commented 2 years ago

Hmm... I'm a bit hesitant here. Could this lead to bypassing the logging for processing requests?

PS: The X- prefix has been deprecated in general.

soxofaan commented 2 years ago

Could this lead to bypassing the logging for processing requests?

I think it can be complementary to (job) logs:

m-mohr commented 2 years ago

Also, allowing this for all openEO endpoints in OpenAPI is quite cumbersome (and repetitive in the spec). Federation-Missing-Backend (or similar) would just be needed for the "list" endpoints (all GET), not all of them. Actually, we don't even need to use a header there...

I think you'd need to add the new header for non-GET requests to CORS, too, which makes it even more annoying to add.

m-mohr commented 2 years ago

Related PR: #419

m-mohr commented 1 year ago

There was a Warning header until 2022, but it is now deprecated: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Warning Bummer...

soxofaan commented 9 months ago

Feature that could help with communicating warnings/deprecations in certain situations is /validation endpoint. The python client (since version 0.24) automatically calls /validation and visualizes issues as warnings (logging.warning style), both on sync execution and batch job creation requests. This gives a backend the opportunity to flag (non-fatal) problems.

m-mohr commented 9 months ago

Sounds like a good idea, although I don't like mixing warnings and errors in the client. Shouldn't the Python client use logging.error?

We could extend the validation output to contain a "warnings" array in addition to the "errors" array. Then they can be distunguised better.

soxofaan commented 9 months ago

... mixing warnings and errors in the client. Shouldn't the Python client use logging.error?

Because validation support is somewhat experimental in some backends (VITO backend included) we didn't want to make this auto-validation feature a blocking thing: even if validation gives errors, we still let the sync evaluation or job creation go through. In that sense it is currently more informative, to get some real world testing of the validation feature.

In practice we even experience that users are now reporting these informative validation messages as real errors, even though their workflow is not actually blocked or failing at all. So in a way one could even argue that logging with level INFO is more appropriate at the moment. That being said, WARNING level is fine at the moment in my opinion.

We could extend the validation output to contain a "warnings" array in addition to the "errors" array. Then they can be distinguished better.

that would indeed be a good idea, and easy to add to the current schema