Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0

Support partial/incomplete batch job results #430

Closed soxofaan closed 2 years ago

soxofaan commented 2 years ago

https://github.com/Open-EO/openeo-api/blob/f303d65a3291d4cd74dacc0e796803bb5d6fa03b/openapi.yaml#L3044-L3045

The API currently requires that a batch job is fully finished before results can be requested with /jobs/{job_id}/results. In the context of very large batch jobs, I think it could be useful to relax this and allow listing incomplete results already (properly indicating that the result is incomplete, of course).

I'm not sure what the best option is: changing the behavior of the existing endpoint /jobs/{job_id}/results, adding a parameter to enable this partial listing, adding a new endpoint, or ...

re: https://github.com/openEOPlatform/architecture-docs/issues/12

soxofaan commented 2 years ago

In the context of very large batch jobs .. it could be useful to ... allow listing of incomplete results

FYI: in the "large area" feature, we aim to split a very large job into smaller jobs at the level of the aggregator, and distribute this work as separate "sub" batch jobs on one or more back-ends. It would be frustrating if partial results were not accessible because one "sub" batch job fails or is stuck.
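To illustrate the scenario above, here is a minimal sketch of how an aggregator might combine the results of several "sub" batch jobs while flagging the combined listing as incomplete. The `SubJob` class, the `collect_results` function and the `partial` key are hypothetical names chosen for illustration; they are not part of the openEO API or of any aggregator implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubJob:
    """Hypothetical record of one "sub" batch job tracked by an aggregator."""
    job_id: str
    status: str  # assumed to follow job status names such as "finished", "running", "error"
    assets: List[str] = field(default_factory=list)


def collect_results(sub_jobs: List[SubJob]) -> Dict[str, object]:
    """Gather assets from finished sub-jobs and flag whether the listing is partial."""
    # Only sub-jobs that finished contribute assets to the listing.
    assets = [a for job in sub_jobs if job.status == "finished" for a in job.assets]
    # The overall listing is complete only if every sub-job finished.
    complete = all(job.status == "finished" for job in sub_jobs)
    return {"assets": assets, "partial": not complete}


if __name__ == "__main__":
    jobs = [
        SubJob("sub-1", "finished", ["tile_1.tif"]),
        SubJob("sub-2", "error"),
        SubJob("sub-3", "running"),
    ]
    print(collect_results(jobs))
```

With this shape, a failed or stuck sub-job no longer hides the assets that other sub-jobs already produced, and the consumer can still tell that the listing is not final.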

m-mohr commented 2 years ago

Adding this to the existing endpoint sounds reasonable, although existing clients could start downloading and propagating partial results as "complete", since the semantics for providing partial results were not present before. My first idea was to use status code 206 instead of 200, but that doesn't fully fit. How would you envision this working with things like start_and_wait in the clients?

soxofaan commented 2 years ago

My current thought is to have an explicit opt-in to view partial results, to stay backward compatible (because it would behaviorally be a breaking change otherwise):

the opt-in could be a request parameter partial that is false by default
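A minimal sketch of what such an opt-in could look like from the client side, assuming the parameter is passed as a query parameter named `partial` with the value `true`. The function name `results_url`, the example host and the exact encoding of the flag are assumptions for illustration; the thread does not fix these details.

```python
from urllib.parse import quote, urlencode


def results_url(api_root: str, job_id: str, partial: bool = False) -> str:
    """Build the URL for listing the results of a batch job.

    `partial` mirrors the opt-in request parameter proposed in this thread.
    Its encoding as a `partial=true` query parameter is an assumption.
    """
    base = f"{api_root.rstrip('/')}/jobs/{quote(job_id, safe='')}/results"
    if not partial:
        # Default behavior stays unchanged: no extra parameter is sent.
        return base
    return f"{base}?{urlencode({'partial': 'true'})}"


if __name__ == "__main__":
    print(results_url("https://backend.example/openeo/1.1", "job-123"))
    print(results_url("https://backend.example/openeo/1.1", "job-123", partial=True))
```

Because the flag defaults to false, a client that never sets it sends exactly the same request as before, which is the backward-compatibility property the opt-in is meant to preserve.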

I'm not sure about the 206 status for partial results, as that seems to be more about returning a targeted subrange of the result, chosen by the user's request. I'm not that familiar with how 206 is used in the wild, though, so I don't have a strong opinion against it either.
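For context on the usual meaning of 206: in HTTP range requests, the client asks for a specific byte range with a `Range` header, and the server answers with 206 Partial Content for exactly that range. A small sketch of the client side follows; the helper name `byte_range_header` is hypothetical and only serves this illustration.

```python
from typing import Dict


def byte_range_header(first: int, last: int) -> Dict[str, str]:
    """Build a `Range` request header for an inclusive byte range."""
    if first < 0 or last < first:
        raise ValueError("expected 0 <= first <= last")
    # Byte positions in a Range header are zero-based and inclusive.
    return {"Range": f"bytes={first}-{last}"}


if __name__ == "__main__":
    # Requests the first 100 bytes of a resource.
    print(byte_range_header(0, 99))
```

Here the subrange is chosen by the client, which differs from a back-end returning whatever part of a batch job result happens to be ready.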

m-mohr commented 2 years ago

Yeah, I think a parameter works for me...

m-mohr commented 2 years ago

A first draft is now available in PR #433