Design: Partial Pipeline execution

bobcatfish commented 5 years ago

The work for this task is to design this feature and present one or more proposals (before implementing).

Expected Behavior

If a Pipeline has many Tasks and takes a long time to run (e.g. tens of minutes, or even hours), and one Task fails, it may be desirable to resume execution from the Task that failed, possibly with different PipelineParams (e.g. from a different git commit), instead of rerunning the whole Pipeline.

Some ideas for how to implement this:

Fields in a PipelineRun which specify which Task to start running from, and/or which refer to a previous PipelineRun from which results should be taken
A tool which makes it easy to create a new Pipeline from an existing one which only runs a subset of the Tasks
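To make both ideas a bit more concrete, here is a minimal sketch of the selection step they share: given the Task to resume from, work out which Tasks would need to run. It is not based on any existing Tekton API. The `run_after` mapping (Task name to the Tasks it must run after), the function name `tasks_to_resume`, and the example Task names are all hypothetical, and the sketch assumes every dependency is itself a key in the mapping.

```python
from collections import deque


def tasks_to_resume(run_after, start):
    """Return `start` plus every Task that transitively depends on it.

    `run_after` is a hypothetical mapping: Task name -> list of Task names
    that must finish before it can run.
    """
    if start not in run_after:
        raise KeyError(f"unknown task: {start}")

    # Invert the dependency map: Task -> Tasks that directly depend on it.
    dependents = {name: [] for name in run_after}
    for name, deps in run_after.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk over everything downstream of `start`.
    selected = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in dependents[current]:
            if nxt not in selected:
                selected.add(nxt)
                queue.append(nxt)

    # Keep the order in which the Tasks were declared in the Pipeline.
    return [name for name in run_after if name in selected]


# Hypothetical Pipeline used only for illustration.
pipeline = {
    "build": [],
    "unit-test": ["build"],
    "deploy": ["unit-test"],
    "lint": [],
    "integration-test": ["deploy"],
}

print(tasks_to_resume(pipeline, "deploy"))
```

Note that this only selects the chosen Task and its descendants. A real design would also have to decide what to do with independent branches that had not finished (such as `lint` here) and how the skipped Tasks' outputs are supplied from the previous PipelineRun.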

It is also worth considering what this could be like via a UI: if one is viewing a Pipeline in a UI, and wants to re-run only a portion of the Pipeline, they probably want the user experience to be as if they were still running the same Pipeline, even if underneath a new Pipeline is created.

Actual Behavior

At the moment, if any Task in a Pipeline fails, your options to rerun the rest of the Pipeline would be:

Run the entire Pipeline again
Create a new Pipeline from the previous one which contains only the Tasks you wish to run

Additional Info

This originally came up in discussion about #39, in the context of whether we'd want to always use the same git commit from a source for all Tasks in a Pipeline, or whether we'd sometimes want a Task to always use HEAD. The latter would allow a user to change a repo, by updating HEAD, between Task executions.

The feature of partial pipeline execution could be an alternative to this.

bobcatfish commented 5 years ago

@BenTheElder, @cjwagner and some other Prow folks indicated that this would be a very desirable feature for them, particularly in a case where your pipeline has an initial phase that builds a bunch of stuff and then subsequent phases that use that built stuff. There, it'd be handy to be able to resume after the point where the stuff is built.

gsaslis commented 4 years ago

Just to add (or rather try to help clarify) a use case here.

This is a very useful feature for long-running pipelines that probably fall outside the strict CI scope. Most pipelines I have in mind are essentially workflow automation pipelines and have external dependencies such as 3rd party systems that need to be up / reachable.

When such a pipeline fails at step 7/11, you really don't want to rerun the whole thing. The Jenkins "Restart from stage" feature is ideal for this, as it lets the pipeline essentially pick up where it left off.

The problem with most Jenkins pipelines is that they are not written in such a way that restarting from any particular stage would be possible, as inputs / outputs of each stage (task) are not always well defined.

Coming to Tekton and finding inputs/outputs so explicitly declared, I almost see an opportunity whereby, once this feature is implemented, it will work on "all" Tekton pipelines, significantly widening the scope of problems Tekton pipelines can be used to solve. (Plus, anyone who relies on this in Jenkins will find it easier to migrate to Tekton.)

As a final point, I would like to clarify that in my use case, support for restarting with "different PipelineParams" (as mentioned in the description) is not a necessary feature. I am sure people have use cases for that too, but I personally like the approach Jenkins takes here: you can either restart the whole pipeline with different params (new pipeline run), or restart from stage, when it failed, always with the same params (retry failed pipeline run, starting from failed task).
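A rough sketch of that two-way choice, purely to illustrate the semantics described above: different params mean a full new run, same params mean resuming from the failed task. Nothing here is an existing Tekton or Jenkins API. `RunRecord`, `plan_rerun`, the status strings, and the `revision` param are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical record of a finished pipeline run."""
    params: dict
    # Task name -> "Succeeded", "Failed" or "NotStarted", in execution order.
    task_status: dict


def plan_rerun(previous, new_params=None):
    """Decide which tasks to run and with which params.

    Different params: start a fresh run of every task.
    Same (or no) params: retry from the failed task, skipping what succeeded.
    """
    if new_params is not None and new_params != previous.params:
        return {"params": dict(new_params), "tasks": list(previous.task_status)}

    remaining = [
        name for name, status in previous.task_status.items()
        if status != "Succeeded"
    ]
    return {"params": dict(previous.params), "tasks": remaining}


# Hypothetical previous run that failed at its second task.
previous = RunRecord(
    params={"revision": "abc123"},
    task_status={"build": "Succeeded", "test": "Failed", "deploy": "NotStarted"},
)

print(plan_rerun(previous))
print(plan_rerun(previous, {"revision": "def456"}))
```

The point of keeping the two paths separate is that the resume path never has to reason about whether outputs produced under the old params are still valid for the new ones.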

Hope this helps.