prominence-eosc / prominence

PROMINENCE server
Apache License 2.0

Support retries of jobs #64

Open alahiff opened 5 years ago

alahiff commented 5 years ago

It should be possible for failed tasks to be retried on the same VM, on a different VM, or both.

alahiff commented 4 years ago

If maximumRetries is specified in the policies for a job, a failed job will be retried up to this many times. There is currently no way to retry a job on another VM/cloud.
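As a rough illustration of the semantics described above, here is a minimal sketch, not PROMINENCE's actual code. The `run_job_once` callable and the shape of the `job` dictionary are hypothetical. Only the `maximumRetries` policy name comes from this thread, and a missing policy is assumed to mean no retries.

```python
from typing import Callable


def run_with_retries(job: dict, run_job_once: Callable[[dict], bool]) -> int:
    """Run a job, retrying on failure; return the number of attempts made.

    `run_job_once` is a hypothetical callable returning True on success.
    """
    # Assumption: a job without the policy is never retried.
    max_retries = job.get("policies", {}).get("maximumRetries", 0)

    attempts = 0
    # One initial attempt plus up to max_retries retries.
    for _ in range(max_retries + 1):
        attempts += 1
        if run_job_once(job):
            break
    return attempts
```

With `maximumRetries` set to 3, a job that fails twice and then succeeds is attempted three times in total; a job that always fails is attempted four times.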

alahiff commented 4 years ago

Currently maximumRetries is being used in workflows at the job level, but it's also being used in promlet as the number of task retries. Probably should have:

alahiff commented 4 years ago

Added maximumRetriesPerTask, and separated per-task from per-job retries in https://github.com/prominence-eosc/prominence/commit/c96406d07fa58c8dc9b5cd9e76e81f7a2bc370f0

Still need to actually handle job retries outside of workflows.
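To make the per-task versus per-job distinction concrete, here is a hedged sketch of how the two limits could nest. It is not taken from the linked commit. The function names, the `run_once` callable, and the choice to rerun every task from the start on a job retry are all assumptions made for illustration.

```python
from typing import Callable, List


def run_task(task: str, run_once: Callable[[str], bool],
             max_retries_per_task: int) -> bool:
    """Attempt one task up to 1 + max_retries_per_task times."""
    # any() stops at the first successful attempt.
    return any(run_once(task) for _ in range(max_retries_per_task + 1))


def run_job(tasks: List[str], run_once: Callable[[str], bool],
            max_retries: int, max_retries_per_task: int) -> bool:
    """Attempt the whole job up to 1 + max_retries times."""
    for _ in range(max_retries + 1):
        # all() stops at the first task that exhausts its retries, so
        # later tasks are skipped for this job attempt.
        # Assumption: a job retry reruns every task from the beginning.
        if all(run_task(t, run_once, max_retries_per_task) for t in tasks):
            return True
    return False
```

Here a task-level retry repeats only the failed task, while a job-level retry repeats the full task sequence, which is why the two counters need separate names.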