riga / law

Build large-scale task workflows: luigi + job submission + remote targets + environment sandboxing using Docker/Singularity
http://law.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Workflow parameters #159

Closed riga closed 1 year ago

riga commented 1 year ago

This PR adds workflow parameters, which parametrize workflows so that the branch lookup becomes dynamic, based on values in the branch map.

From the README:


Example code:

import law
import luigi

class MyWorkflow(law.LocalWorkflow):

    option_a = law.WorkflowParameter()
    option_b = law.WorkflowParameter(cls=luigi.IntParameter)

    @classmethod
    def create_branch_map(cls, params):
        return {
            0: {"option_a": "foo", "option_b": 123},
            1: {"option_a": "bar", "option_b": 456},
            2: {"option_a": "foo", "option_b": 456},
            # ... you could add more branches here
        }

    def run(self):
        # this run method is only called by *branches*, i.e.,
        #   - self.branch will be 0, 1 or 2, and
        #   - self.branch_data will refer to the corresponding dictionary in the branch map

        pass  # ... implementation

option_a and option_b are law.WorkflowParameters that can optionally define a cls (or inst) of another parameter object, which is then used for parameter parsing and serialization (as always).
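To illustrate the cls / inst idea, here is a minimal, library-free sketch of a parameter that delegates parsing and serialization to an inner parameter object. It is not law's implementation: `DelegatingParameter`, `IntParser` and `StrParser` are hypothetical names, with `IntParser` standing in for something like luigi.IntParameter.

```python
class StrParser:
    """Hypothetical default: values stay plain strings."""

    def parse(self, s):
        return s

    def serialize(self, value):
        return str(value)


class IntParser:
    """Hypothetical stand-in for a typed parameter such as luigi.IntParameter."""

    def parse(self, s):
        return int(s)

    def serialize(self, value):
        return str(value)


class DelegatingParameter:
    """Hypothetical wrapper that forwards parsing to an inner parameter."""

    def __init__(self, cls=None, inst=None):
        if cls is not None and inst is not None:
            raise ValueError("pass either cls or inst, not both")
        # build the inner object from cls when no ready-made instance is given
        if inst is None:
            inst = (cls or StrParser)()
        self.inst = inst

    def parse(self, s):
        return self.inst.parse(s)

    def serialize(self, value):
        return self.inst.serialize(value)


option_a = DelegatingParameter()
option_b = DelegatingParameter(cls=IntParser)
```

With this sketch, `option_b.parse("123")` yields the integer 123, while `option_a.parse("foo")` leaves the string unchanged.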

create_branch_map becomes a @classmethod and receives all task parameters in a dictionary, params. This is necessary because the automatic lookup of branches from the values of workflow parameters must happen internally before the task is instantiated.
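The lookup described here can be pictured with a small, library-free sketch. It only assumes that a branch map is a dictionary from branch numbers to dictionaries of values, as in the example code; the helper `find_matching_branches` is a hypothetical name and not part of law.

```python
def find_matching_branches(branch_map, requested):
    """Return the sorted branch numbers whose data agrees with every requested value.

    branch_map: dict mapping branch number -> dict of parameter values
    requested:  dict of parameter name -> value that was explicitly set
    """
    return sorted(
        branch
        for branch, data in branch_map.items()
        if all(name in data and data[name] == value for name, value in requested.items())
    )


# same branch map as in the example workflow
branch_map = {
    0: {"option_a": "foo", "option_b": 123},
    1: {"option_a": "bar", "option_b": 456},
    2: {"option_a": "foo", "option_b": 456},
}

foo_branches = find_matching_branches(branch_map, {"option_a": "foo"})
single_branch = find_matching_branches(branch_map, {"option_a": "foo", "option_b": 456})
```

Here `foo_branches` is `[0, 2]` and `single_branch` is `[2]`. Because the function needs only the branch map and plain parameter values, it can run without a task instance, which is the reason a classmethod fits.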

As a result, parameters can be defined verbosely on the command line, translate to branch values, and configure which branches are run by the workflow:


Closes #131, #137.