AmadeusITGroup / workflow-controller

Kubernetes workflow controller
Apache License 2.0

Workflow with FAILED status #37

Open alexei-led opened 6 years ago

alexei-led commented 6 years ago

Currently, Workflow can have one of two possible statuses:

  1. WorkflowComplete - when all its steps are completed
  2. WorkflowFailed - when "Deadline" is exceeded

IMHO, it should also be marked as failed when one of the workflow steps fails (that is, when its Job fails).
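
To make the proposal concrete, here is a minimal sketch of how the status could be derived. It is not the controller's actual code: `StepState` and `evaluate` are hypothetical names, and the deadline check is reduced to a boolean. Only the two condition names come from the description above.

```go
package main

// StepState is a hypothetical summary of one workflow step's Job.
// The controller's real types are not shown in this thread.
type StepState int

const (
	StepRunning StepState = iota
	StepComplete
	StepFailed
)

// Condition names as used in the issue description.
const (
	WorkflowComplete = "WorkflowComplete"
	WorkflowFailed   = "WorkflowFailed"
)

// evaluate returns the terminal condition of a workflow, or "" while the
// workflow is still in progress.
func evaluate(steps []StepState, deadlineExceeded bool) string {
	allComplete := true
	for _, s := range steps {
		if s == StepFailed {
			// Proposed behaviour: a failed step Job fails the workflow.
			return WorkflowFailed
		}
		if s != StepComplete {
			allComplete = false
		}
	}
	if allComplete {
		// Existing behaviour: every step has completed.
		return WorkflowComplete
	}
	if deadlineExceeded {
		// Existing behaviour: the deadline passed before completion.
		return WorkflowFailed
	}
	return ""
}
```

With this ordering, a single failed step is enough to fail the workflow even when other steps are still running.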

sdminonne commented 6 years ago

Yeah, in general, I concur. The problem is: what are we going to do when a Job fails?

  1. Should we implement a retry policy? I think we should.
  2. When a Job fails, should we stop the whole workflow? I think we should.
  3. Should we remove the failed workflow? I think we should not.

Thoughts? @clamoriniere1A: ideas?

alexei-led commented 6 years ago

@sdminonne when a Job fails after backoffLimit retries, we can stop the Workflow (if the user wants that, by specifying some tag). I also think we should not remove the failed workflow; the user can do that later if they want to. Keeping the jobs/pods generated by the workflow will allow inspecting the failure and maybe fixing the workflow for the next run.
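
A rough sketch of that decision, assuming an opt-in setting. The "tag" has not been designed yet, so `FailurePolicy`, `StopOnJobFailure`, `Decision` and `decide` are all hypothetical names used only for illustration.

```go
package main

// FailurePolicy is a hypothetical stand-in for the user-specified "tag".
type FailurePolicy struct {
	// StopOnJobFailure opts in to stopping the workflow when a step Job fails.
	StopOnJobFailure bool
}

// Decision describes what the controller would do next (hypothetical type).
type Decision struct {
	MarkWorkflowFailed     bool
	ScheduleRemainingSteps bool
	// DeleteJobs is never set by decide: Jobs and pods are kept so the
	// failure can be inspected later.
	DeleteJobs bool
}

// decide is meant to be called once a step's Job has reached a final state.
// jobFailed is assumed to be true only after the Job has used up its
// backoffLimit retries.
func decide(jobFailed bool, p FailurePolicy) Decision {
	if jobFailed && p.StopOnJobFailure {
		// Stop the workflow, but leave its Jobs and pods in place.
		return Decision{MarkWorkflowFailed: true}
	}
	// Without the opt-in, this sketch simply keeps scheduling. What should
	// happen to steps that depend on the failed one is not settled here.
	return Decision{ScheduleRemainingSteps: true}
}
```

Retries stay with the Job's own backoffLimit, so the sketch only reacts once the Job has finally failed.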

sdminonne commented 6 years ago

@alexei-led agreed. We need to put together a proposal for the tags.