gsksivesh / dagobah

Simple DAG-based job scheduler in Python
Do What The F*ck You Want To Public License

Add ability to manually bypass failed tasks #15

Open gsksivesh opened 5 years ago

gsksivesh commented 5 years ago

Issue by thieman Friday Jul 19, 2013 at 15:35 GMT Originally opened as https://github.com/thieman/dagobah/issues/22


Use case: I just had a task fail, so I went into IPython and screwed around in the interpreter until I fixed the bug (which occurred halfway through a processing task) and let the task complete in my interpreter. Now I want Dagobah to continue the rest of the failed job without having to rerun the entire task that I just completed manually.

gsksivesh commented 5 years ago

Comment by rclough Wednesday May 07, 2014 at 20:18 GMT


This would be a cool feature. Here are two ideas I have for it:

  1. A "start from step " option that starts any workflow from a given step
  2. A "mark step as completed" option, which might be cleaner. I.e., after manual intervention, you mark the step as successful and the dependency chain continues as normal.

Idea 1 has additional use cases beyond failures, e.g. you have bad data from one point on, so there's no explicit failure, but you don't want to run the whole job again. Idea 2 is probably the simplest fix for manually correcting errors like in your use case.
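
A rough sketch of what idea 1 could look like, not Dagobah's actual code: given a start step, collect that step plus everything downstream of it, and run only that set. The `downstream` mapping, the `tasks_from_step` helper, and the task names are all hypothetical, made up for illustration. It also assumes every task outside the returned set has already produced usable output.

```python
from collections import deque


def tasks_from_step(downstream, start):
    """Return the start task plus every task reachable from it.

    `downstream` maps a task name to the names of the tasks that depend on it
    (a hypothetical representation, not Dagobah's internal one).
    """
    to_run = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child not in to_run:
                to_run.add(child)
                queue.append(child)
    return to_run


# Hypothetical four-task job: extract -> clean -> (load, report)
graph = {
    "extract": ["clean"],
    "clean": ["load", "report"],
    "load": [],
    "report": [],
}

# Restarting from "clean" leaves "extract" alone.
restart_set = tasks_from_step(graph, "clean")
```

Here `restart_set` contains `clean`, `load`, and `report`, so `extract` would not be run again.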

gsksivesh commented 5 years ago

Comment by thieman Wednesday May 07, 2014 at 21:50 GMT


I think we'd want to do a slight variant on 2. Instead of marking as completed, we should add a fourth task state representing a skipped job. Job states currently include waiting, running, and failed. These states are used to determine what actions are possible at any given time: https://github.com/thieman/dagobah/blob/master/dagobah/core/components.py#L49

I think the clarity of adding a fourth state would be worth the extra effort. It would be weird to get a job completed email that told you Task 7 was completed when in fact you skipped it.

You should be able to manually mark a task as skipped in the UI and then retry the job. Skipped tasks should be counted as "completed" for the purposes of graph traversal and job completion.
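
To make the proposal concrete, here is a minimal sketch of how a skipped state could count as "completed" for traversal while still being reported separately. It is an assumption-heavy illustration, not the code in `components.py`: the `TaskState` enum, the `upstream` mapping, and the helper names are all hypothetical, and the sketch tracks a per-task state rather than reproducing Dagobah's real state model.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical per-task states for this sketch only.
    WAITING = "waiting"
    RUNNING = "running"
    FAILED = "failed"
    COMPLETED = "completed"
    SKIPPED = "skipped"


# Both states unblock downstream tasks and count toward job completion.
DONE_STATES = {TaskState.COMPLETED, TaskState.SKIPPED}


def ready_tasks(upstream, states):
    """Waiting tasks whose upstream tasks are all completed or skipped."""
    return sorted(
        task
        for task, state in states.items()
        if state is TaskState.WAITING
        and all(states[parent] in DONE_STATES for parent in upstream.get(task, []))
    )


def job_finished(states):
    """A job is finished when every task is completed or skipped."""
    return all(state in DONE_STATES for state in states.values())


def skipped_tasks(states):
    """Tasks to call out as skipped (not completed) in a notification."""
    return sorted(t for t, s in states.items() if s is TaskState.SKIPPED)


# Hypothetical linear job a -> b -> c where b has failed.
upstream = {"a": [], "b": ["a"], "c": ["b"]}
states = {"a": TaskState.COMPLETED, "b": TaskState.FAILED, "c": TaskState.WAITING}

blocked = ready_tasks(upstream, states)      # c is blocked by the failed b
states["b"] = TaskState.SKIPPED              # manual "mark as skipped"
unblocked = ready_tasks(upstream, states)    # c can now run on retry
states["c"] = TaskState.COMPLETED
done = job_finished(states)
report = skipped_tasks(states)
```

Keeping `SKIPPED` distinct from `COMPLETED` is what lets the completion email say that `b` was skipped rather than claiming it ran.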