Open-EO / openeo-api

The openEO API specification
http://api.openeo.org
Apache License 2.0

response on starting a job that is already queued/running/finished #442

Closed soxofaan closed 1 year ago

soxofaan commented 2 years ago

POST /jobs/{job_id}/results is meant to get a created job into queued/running state, and responds with 202 The creation of the resource has been queued successfully.

But it's unclear what the response should be when doing POST /jobs/{job_id}/results on a job that is already queued/running/finished/error.

This endpoint has no effect if the job status is already in status queued or running.

Doing nothing at back-end side is fine, but what should the response be? Also 202 The creation of the resource has been queued successfully.?

And what in case of a job that is already in status finished or error? While "having no effect" is still fine here too, I guess, it's a bit weird to respond with 202 The creation of the resource has been queued successfully. Returning 4XX/5XX also doesn't really fit, I'd think.

(related to #436)

m-mohr commented 2 years ago

Doing nothing at back-end side is fine, but what should the response be? Also 202 The creation of the resource has been queued successfully.?

The same. It's indeed a bit weird, but I'm also not sure it helps a lot to define distinct status code for all scenarios?

By the way, the text "The creation of the resource has been queued successfully." is something that I added as explanation to the spec. We can surely clarify this. The official status code text is just "Accepted". See https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/202 and the description seems to fit pretty well:

The HyperText Transfer Protocol (HTTP) 202 Accepted response status code indicates that the request has been accepted for processing, but the processing has not been completed; in fact, processing may not have started yet.

And what in case of a job that is already in status finished or error?

It starts again from the beginning. A finished, errored or canceled job can be restarted by queueing it again. We may want to add this behavior explicitly to the description though, right? Or should we leave it open so that the back-end can decide whether they want to allow this?
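A minimal sketch of the behavior described above, assuming a hypothetical in-memory back-end. The `Job` class and the `start_job` function are illustrative names, not part of the openEO spec, and discarding earlier results on restart is an assumption. Every call responds with 202; only the side effect depends on the job status.

```python
from dataclasses import dataclass, field

# Statuses in which POST /jobs/{job_id}/results has no effect.
NO_EFFECT = {"queued", "running"}


@dataclass
class Job:
    """Hypothetical in-memory job record (not an openEO API object)."""
    status: str = "created"
    results: list = field(default_factory=list)


def start_job(job: Job) -> int:
    """Sketch of a handler for POST /jobs/{job_id}/results.

    Returns the HTTP status code to respond with.
    """
    if job.status not in NO_EFFECT:
        # created, finished, error or canceled: (re)queue from the beginning.
        # Assumption: earlier results are discarded on restart.
        job.results.clear()
        job.status = "queued"
    # 202 Accepted in every case, including the no-op ones.
    return 202
```

A back-end that does not want to allow restarts would replace the unconditional re-queue with its own policy.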

soxofaan commented 2 years ago

The same. It's indeed a bit weird, but I'm also not sure it helps a lot to define distinct status code for all scenarios?

Yes, fine. Clients usually don't expose the HTTP code and message, so no need for overkill at the moment I think.

And what in case of a job that is already in status finished or error?

It starts again from the beginning. A finished, errored or canceled job can be restarted by queueing it again.

I'm not sure. You could add a safeguard against accidentally overwriting results/logs: a finished/errored job first has to be canceled before it can be started again. That way the user has to explicitly confirm that any existing results/logs can be discarded.

This is roughly the current behavior of the VITO backend and I kind of exploit that sometimes when doing interactive experimentation in jupyter, e.g.:

m-mohr commented 2 years ago

You can't cancel a finished job anymore.

DELETE /jobs/:id/results says:

This endpoint only has an effect if the job status is queued or running.

Indeed, it could make more sense to allow deleting results for finished and errored jobs, and canceling a queued job, and then only allow a canceled job to be started again. But that's not spec'ed and is also not really backward-compatible. In that sense, the client should probably handle this differently. You can still handle it on the client side: if a job is finished and you send start again, ask for confirmation first. Probably also something for JS/R/Web Editor.
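A rough sketch of such a client-side check, not taken from any existing openEO client. The function `start_with_confirmation` and its parameters are hypothetical: `send_start` stands for whatever sends POST /jobs/{job_id}/results, and `confirm` for whatever prompts the user.

```python
from typing import Callable

# Statuses for which a restart could discard existing results or logs.
NEEDS_CONFIRMATION = {"finished", "error"}


def start_with_confirmation(
    status: str,
    send_start: Callable[[], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Call send_start() unless the user declines to restart.

    Returns True if the start request was sent, False otherwise.
    """
    if status in NEEDS_CONFIRMATION:
        question = f"Job is '{status}'. Restart and discard results/logs?"
        if not confirm(question):
            return False
    send_start()
    return True
```

In an interactive session, `confirm` could be as simple as `lambda q: input(q + " [y/N] ").lower() == "y"`.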

m-mohr commented 2 years ago

Some improvements are in PR #444. The functionality to ask for confirmation before restarting finished batch jobs has been requested in several issues across the client landscape.

clausmichele commented 2 years ago

At EURAC we don't allow restarting a job if it has finished successfully or if it's in the running or queued state. A job can only be started if it's in the created or error state. Otherwise, we return an informative message telling the user to create a new job instead.
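A stricter policy of this kind could look roughly like the sketch below. This is not EURAC's actual code: `check_startable` and `JobNotStartable` are hypothetical names, the HTTP status code for the rejection is deliberately left out, and whether canceled jobs should be startable is an open question in this thread.

```python
# Statuses from which a start request is accepted under the stricter policy.
STARTABLE = {"created", "error"}


class JobNotStartable(Exception):
    """Hypothetical error carrying the message shown to the user."""


def check_startable(status: str) -> None:
    """Raise JobNotStartable unless a job in this status may be started."""
    if status not in STARTABLE:
        raise JobNotStartable(
            f"Job in status '{status}' cannot be started again; "
            "create a new job instead."
        )
```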

m-mohr commented 2 years ago

That's also fine, I think. Not 100% compliant, but works :-) I guess you can also restart if it was canceled?