Open alahiff opened 5 years ago
If `maximumRetries` is specified in the policies for a job, a failed job will be retried up to this many times. There is currently no way to retry a job on another VM/cloud.
Currently `maximumRetries` is being used in workflows at the job level, but it's also being used in promlet as the number of task retries. Probably should have:
- `maximumRetries` or `maximumRetriesPerJob` (job level)
- `maximumRetriesPerTask` (task level)

Added `maximumRetriesPerTask`, and separated per-task from per-job retries in https://github.com/prominence-eosc/prominence/commit/c96406d07fa58c8dc9b5cd9e76e81f7a2bc370f0
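
For illustration only, the split between the two levels could be read from a job's policies roughly as below. This is a sketch, not the code from the commit: the helper name `retry_limits`, the defaults of 0, and the choice to treat the older `maximumRetries` key as a fallback for the job-level limit are all assumptions.

```python
def retry_limits(policies):
    """Return (per_job, per_task) retry limits from a job's policies dict.

    Hypothetical helper: the fallback order and the defaults of 0 are
    assumptions, not confirmed behaviour.
    """
    # Prefer the explicit job-level key; fall back to the older maximumRetries.
    per_job = policies.get("maximumRetriesPerJob", policies.get("maximumRetries", 0))
    # Task-level retries have their own key and never reuse the job-level value.
    per_task = policies.get("maximumRetriesPerTask", 0)
    return int(per_job), int(per_task)
```

With this reading, `{"maximumRetries": 3}` gives three job-level retries and no task-level retries, so the two counts can no longer be confused.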
Still need to actually handle job retries outside of workflows.
It should be possible for tasks to be retried on the same VM if they fail, as well as (or instead of) being retried on a different VM.
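
A rough sketch of how the two retry levels might nest: tasks are retried in place on the current VM first, and only when a task exhausts its retries is the whole job retried on the next VM. Everything here is hypothetical (the `run_job` function, the list of VM names standing in for real placement, and tasks modelled as callables that return `True` on success); it is meant to show the intended behaviour, not how promlet does it.

```python
def run_job(tasks, vms, max_retries_per_job=0, max_retries_per_task=0):
    """Run tasks in order; return (succeeded, attempts_log).

    tasks: list of callables taking a VM name and returning True on success.
    vms: list of VM names to cycle through (stand-in for real placement).
    """
    log = []
    for job_attempt in range(max_retries_per_job + 1):
        # Each job-level retry moves on to the next VM in the list.
        vm = vms[job_attempt % len(vms)]
        job_ok = True
        for index, task in enumerate(tasks):
            task_ok = False
            # Task-level retries stay on the same VM.
            for task_attempt in range(max_retries_per_task + 1):
                log.append((vm, index, task_attempt))
                if task(vm):
                    task_ok = True
                    break
            if not task_ok:
                # Task retries exhausted: abandon this job attempt.
                job_ok = False
                break
        if job_ok:
            return True, log
    return False, log
```

For example, a task that only succeeds on `"vm-b"`, run with one retry at each level over `["vm-a", "vm-b"]`, is attempted twice on `vm-a` and then once on `vm-b`.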