adamkewley / jobson

A platform for transforming command-line applications into a job service.
Apache License 2.0

Add job attempt support #35

Closed: adamkewley closed this issue 5 years ago

adamkewley commented 6 years ago

Currently, Jobson attempts a job only once and then finishes with "success", "failure", or "aborted". However, some (internal) workflows would benefit from an attempts API that automatically re-runs jobs when they fail. The support would need to:

Ideally, this feature can be integrated into the existing API with no breaking changes. That may not be possible, though, because downstream users would start seeing a job's outputs change between attempts (e.g. they request stdout during attempt #1, which is different from the stdout of attempt #2). I'd need to ensure there are no caching or immutability expectations in downstream clients.

This feature would reduce the number of resubmissions made in prod (e.g. when a cluster is down) and enable developers/end-users to rerun something under the existing job ID (rather than having to create a whole new job).
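
To make the idea concrete, here is a rough sketch of what "attempts under one job ID" could look like. It is not Jobson code and does not reflect Jobson's API: `Job`, `Attempt`, `run_with_attempts`, `max_attempts`, and the `"submitted"` placeholder status are all hypothetical names chosen for illustration, and it assumes a failed attempt is simply one with a non-zero exit code. Retries happen immediately, with no backoff.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attempt:
    number: int   # 1-based attempt index
    status: str   # "success" or "failure"
    stdout: str   # output captured for this attempt only


@dataclass
class Job:
    job_id: str   # stays the same across all attempts
    argv: List[str]
    attempts: List[Attempt] = field(default_factory=list)

    @property
    def status(self) -> str:
        # The job's status is the status of its latest attempt.
        # "submitted" is a placeholder for "no attempt has run yet".
        return self.attempts[-1].status if self.attempts else "submitted"


def run_with_attempts(job: Job, max_attempts: int = 3) -> Job:
    """Run job.argv until it exits with 0 or max_attempts is reached."""
    for number in range(1, max_attempts + 1):
        proc = subprocess.run(job.argv, capture_output=True, text=True)
        status = "success" if proc.returncode == 0 else "failure"
        # Each attempt keeps its own stdout instead of overwriting the last one.
        job.attempts.append(Attempt(number, status, proc.stdout))
        if status == "success":
            break
    return job


if __name__ == "__main__":
    # Hypothetical command that always fails, so every attempt is used up.
    job = Job("job-1", [sys.executable, "-c", "import sys; sys.exit(1)"])
    run_with_attempts(job, max_attempts=2)
    print(job.job_id, job.status, len(job.attempts))
```

Keeping per-attempt stdout in a list is what surfaces the compatibility concern above: a client that asks for "the" stdout of a job would need to decide whether it means the latest attempt or a specific one.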

adamkewley commented 5 years ago

Dropped for 1.0.0, stale issue.