The implementation would follow the simplest "Sequence" workflow pattern (http://www.workflowpatterns.com/patterns/control/basic/wcp1.php). Basically, the engine would execute a shell command and, if the exit status is OK, execute the next shell command, and so on. If the exit status of any command is not OK, the engine would stop and exit with an error.
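To make the "Sequence" behaviour concrete, here is a minimal sketch, assuming that "exit status is OK" means an exit status of 0 and that each step is a single shell command line. The function name `run_serial` is hypothetical and not part of any existing REANA component.

```python
import subprocess


def run_serial(commands):
    """Run shell commands one after another and stop at the first failure.

    Returns a (exit_status, completed) tuple, where ``completed`` lists the
    commands that finished with exit status 0 before the run ended.
    """
    completed = []
    for command in commands:
        # shell=True because every step is given as a shell command line.
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            # Assumption: any non-zero exit status aborts the whole workflow.
            return result.returncode, completed
        completed.append(command)
    return 0, completed
```

Called as `run_serial(["exit 0", "exit 3", "exit 0"])`, the sketch would stop at the second command and never start the third.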
The serial workflow could be pictured as follows:
      inputs
        |
        V
    +-------+
    | step1 | ... running in environment E1 with runtime-mounted code C1 on inputs I1
    +-------+
        |
        V
    +-------+
    | step2 | ... running in environment E2 with runtime-mounted code C2 on inputs I2
    +-------+
        |
        V
       ...
        |
        V
    +-------+
    | stepN | ... running in environment EN with runtime-mounted code CN on inputs IN
    +-------+
        |
        V
     outputs
In theory, every step of the workflow could run in a different computing environment (a different Docker image) with different runtime code and input parameters.
In practice, it would not be worthwhile to go too far in that direction. The main goal is to offer something simple for people looking for a Travis CI-like definition of commands to run. (Think of the use case of manipulating videos by running ffmpeg jobs on a K8s cloud.) For people with advanced needs, we would advise using the real full-featured workflow engines, CWL and Yadage.
Hence we don't want to go into specifying full tuples (step_i, environment_i, inputs_i, code_i, commands_to_run_i, outputs_i). It should be sufficient to mount the input runtime code once for all the steps, or even to use the same environment for all the steps, i.e. (step_i, environment_1, inputs_1, code_1, commands_to_run_i), which is sort of what Travis CI does. (Circle CI permits specifying different environments, I think.)
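The reduced tuple could be handled by letting each step inherit a workflow-level environment unless it names its own. The sketch below is only an illustration under that assumption: the `resolve_steps` helper and the dictionary layout (a top-level `environment` key plus a `steps` list) are hypothetical, not an agreed specification.

```python
def resolve_steps(spec):
    """Return one (environment, commands) pair per step.

    Assumption: a step without its own "environment" key falls back to the
    workflow-level "environment", so simple workflows declare it only once.
    """
    default_environment = spec.get("environment")
    resolved = []
    for index, step in enumerate(spec["steps"], start=1):
        environment = step.get("environment", default_environment)
        if environment is None:
            raise ValueError(
                f"step {index} has no environment and no default is set"
            )
        # Copy the command list so callers cannot mutate the original spec.
        resolved.append((environment, list(step["commands"])))
    return resolved
```

With a default of `johndoe/plotter` and steps that only list commands, every step would resolve to that one image, which is the Travis CI-like case; a step that sets its own `environment` would override it.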
Option 1: use the same environment for each step
Option 2: use a different environment in different steps
    environments:
      - type: docker
        image: johndoe/filter-big
      - type: docker
        image: johndoe/filter-small
      - type: docker
        image: johndoe/plotter
    workflow:
      type: serial
      steps:
        - environment: filter-big
          commands:
            - run some shell command
            - run another shell command
        - environment: filter-small
          commands:
            - run something
            - run something else
            - run even more things
            - finish up
        - environment: plotter
          commands:
            - gnuplot plots.gnuplot myresults.csv
            - gnuplot plots.gnuplot myotherresults.csv
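As a rough illustration of how an engine might interpret the Option 2 specification, the sketch below turns the parsed specification (as a Python dictionary) into one `docker run` argument list per command. Several things here are assumptions rather than decisions: the helper names, the `/workspace` mount point, the use of plain `docker run` as a stand-in for K8s jobs, and the rule that a step's `environment` refers to the last path component of the image name (the specification above does not say how the two are linked). The `--rm`, `-v` and `-w` flags are standard Docker CLI options.

```python
def environment_name(image):
    # Assumption: "johndoe/filter-big" is referred to as "filter-big".
    return image.rsplit("/", 1)[-1]


def docker_commands(spec, workspace="/workspace"):
    """Build one `docker run` argument list per command, in workflow order.

    Nothing is executed here; the lists could be passed to subprocess.run().
    """
    images = {
        environment_name(env["image"]): env["image"]
        for env in spec["environments"]
    }
    argvs = []
    for step in spec["workflow"]["steps"]:
        image = images[step["environment"]]  # KeyError if undeclared
        for command in step["commands"]:
            argvs.append([
                "docker", "run", "--rm",
                "-v", f"{workspace}:{workspace}",  # hypothetical shared workspace
                "-w", workspace,
                image,
                "sh", "-c", command,
            ])
    return argvs
```

Running each command in a fresh container keeps the sketch simple; a real engine might prefer one container per step so that shell state carries over between the commands of that step.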
Let's muse IRL during the kick-off to come up with a very simple specification for those light users who are not primarily looking for a workflow engine, while still permitting them to easily enter the computational workflow domain and start using CWL, Snakemake, Yadage, etc. later.
r-w-e-serial will provide an ultra-simple serial/sequential workflow engine useful for people who need to run a sequence of commands and who might be put off by the complexity that CWL or Yadage might bring. See more musings about the motivation in https://github.com/reanahub/reana-demo-helloworld/issues/13 and https://github.com/reanahub/reana-client/issues/10#issuecomment-338906229.