LLNL / maestrowf

A tool to easily orchestrate general computational workflows both locally and on supercomputers
https://maestrowf.readthedocs.io
MIT License
133 stars 43 forks

[Proposal] YAML Specification Improvements #147

Open FrankD412 opened 6 years ago

FrankD412 commented 6 years ago

So I've been looking at some of the limitations of the YAML specification as it is, and I've come up with a new structure for it.

name: step_name
description: Some human readable English here for the step.
cmd: |
    exec some commands here
restart: |
    exec a restart here
post: |
    exec a small-ish local step after the main step terminates
resources:
    sched: <string for type of adapter>
    nodes: <integer number of nodes for allocation>
    tasks: <integer number of tasks for program>
    procs: <integer number of processors to use for tasks>
    gpus: <integer number of GPUs to use>
    cores per task: <integer number of cores to be allocated to each task>

The idea behind splitting out the resources is twofold:

  1. This organization lets Maestro expose monikers like $(resources.<name>); so long as the key exists in resources:, it can be used. Users could then "manually" select which MPI they want to use, in a fashion similar to mpirun -n $(resources.tasks) ... some --command --to run. In principle, users could even do shell math to compute MPI-related values (and Maestro could even calculate high-level settings by inferring them from statically set values).
  2. It consolidates resource settings and allows for more flexible step-wise specification of required resources. Sensible defaults could be provided and a higher-level batch could be specified, with step-wise values then overriding the batch values. This opens up a clear and simple line of application for batch parameters, consolidated in a uniform fashion while still giving the user more control.
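
To make the two points concrete, here is a minimal Python sketch of how step-wise resources could override batch-level defaults and how $(resources.<name>) monikers could then be expanded in a command. Everything in it is an assumption for illustration, not Maestro's actual implementation: the names BATCH_DEFAULTS, resolve_resources, and expand_resource_tokens are hypothetical, as are the ./app command and the example values.

```python
import re

# Hypothetical study-wide defaults, standing in for a higher-level batch block.
BATCH_DEFAULTS = {"sched": "local", "nodes": 1, "tasks": 1}

# Matches tokens of the form $(resources.<name>). Keys may contain spaces,
# as in "cores per task", so everything up to the closing paren is the key.
TOKEN = re.compile(r"\$\(resources\.([^)]+)\)")


def resolve_resources(batch_defaults, step_resources):
    """Return the batch defaults overridden by the step's own values."""
    merged = dict(batch_defaults)
    merged.update(step_resources)
    return merged


def expand_resource_tokens(cmd, resources):
    """Replace each $(resources.<name>) token with its resolved value."""
    def substitute(match):
        key = match.group(1)
        if key not in resources:
            raise KeyError(f"'{key}' is not defined under resources:")
        return str(resources[key])

    return TOKEN.sub(substitute, cmd)


# A hypothetical step that overrides two of the defaults.
step = {
    "cmd": "mpirun -n $(resources.tasks) ./app --nodes $(resources.nodes)",
    "resources": {"sched": "slurm", "tasks": 16},
}

resources = resolve_resources(BATCH_DEFAULTS, step["resources"])
expanded = expand_resource_tokens(step["cmd"], resources)
print(expanded)  # mpirun -n 16 ./app --nodes 1
```

Raising on an unknown key is one possible design choice; it surfaces a typo in a moniker before the step is submitted rather than leaving an unexpanded token in the script.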

FrankD412 commented 6 years ago

As a note for consideration, I'm thinking about transitioning it to a dictionary where the name of the step is the key. That makes it cleaner instead of having hyphens everywhere, though that's more of an aesthetic choice.
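
As a rough illustration of the difference, the sketch below converts a list of step entries (the hyphenated form, where each entry carries its own name) into a mapping keyed by step name. The function steps_to_mapping is hypothetical and not part of Maestro; the step names are borrowed from the LULESH example in this thread.

```python
def steps_to_mapping(step_list):
    """Convert a list of step entries into a mapping keyed by step name."""
    mapping = {}
    for entry in step_list:
        body = dict(entry)  # copy so the caller's entry is not modified
        name = body.pop("name")
        if name in mapping:
            raise ValueError(f"duplicate step name: {name}")
        mapping[name] = body
    return mapping


# List form: what a YAML sequence of "- name: ..." entries loads into.
list_form = [
    {"name": "make-lulesh", "description": "Build the serial version of LULESH."},
    {"name": "run-lulesh", "description": "Run LULESH."},
]

dict_form = steps_to_mapping(list_form)
print(list(dict_form))  # ['make-lulesh', 'run-lulesh']
```

Beyond aesthetics, one side effect of the keyed form is that a step name can only appear once per mapping, so the duplicate check above would no longer be needed after loading.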

FrankD412 commented 6 years ago

Here's an example of what a study block might look like:

study:
    make-lulesh:
        description: Build the serial version of LULESH.
        depends: []
        run:
            cmd: |
              cd $(LULESH)
              sed -i 's/^CXX = $(MPICXX)/CXX = $(SERCXX)/' ./Makefile
              sed -i 's/^CXXFLAGS = -g -O3 -fopenmp/#CXXFLAGS = -g -O3 -fopenmp/' ./Makefile
              sed -i 's/^#LDFLAGS = -g -O3/LDFLAGS = -g -O3/' ./Makefile
              sed -i 's/^LDFLAGS = -g -O3 -fopenmp/#LDFLAGS = -g -O3 -fopenmp/' ./Makefile
              sed -i 's/^#CXXFLAGS = -g -O3 -I/CXXFLAGS = -g -O3 -I/' ./Makefile
              make clean
              make
        resources:
            # Not needed, but illustrates the default.
            adapter: local

    run-lulesh:
        description: Run LULESH.
        depends: [make-lulesh]
        run:
            cmd: |
              $(LULESH)/lulesh2.0 -s $(SIZE) -i $(ITERATIONS) -p > $(outfile)
        resources:
            adapter: flux
            nodes: 1
            cores: 1
            tasks: 1
            cores per task: 1

    post-process-lulesh:
        description: Post process all LULESH results.
        depends: [run-lulesh_*]
        run:
            cmd: |
              echo "Unparameterized step with Parameter Independent dependencies." >> out.log
              echo $(run-lulesh.workspace) >> out.log
              ls $(run-lulesh.workspace) > ls.log
        resources:
            # Not needed, but illustrates the default.
            adapter: local
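
With steps keyed by name, the depends entries form a graph directly. The sketch below shows one way an execution order could be derived from the block above, using Python's standard-library graphlib. It is an illustration under assumptions, not how Maestro resolves dependencies: execution_order and parent_name are hypothetical, and the "_*" suffix is simply stripped on the assumption that it means "all parameterized instances of that step".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The study block above, reduced to the keys that matter for ordering.
steps = {
    "make-lulesh": {"depends": []},
    "run-lulesh": {"depends": ["make-lulesh"]},
    "post-process-lulesh": {"depends": ["run-lulesh_*"]},
}


def parent_name(dependency):
    """Strip the assumed "_*" suffix to get the parent step's name."""
    return dependency[:-2] if dependency.endswith("_*") else dependency


def execution_order(steps):
    """Return step names ordered so every step follows its dependencies."""
    graph = {
        name: {parent_name(dep) for dep in body.get("depends", [])}
        for name, body in steps.items()
    }
    unknown = {p for parents in graph.values() for p in parents} - set(graph)
    if unknown:
        raise ValueError(f"depends on undefined steps: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError if the steps form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)  # ['make-lulesh', 'run-lulesh', 'post-process-lulesh']
```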

FrankD412 commented 6 years ago

I'm considering renaming study to steps. The file as a whole is the study, so using study as the key for the block that only lists the study's steps feels like a bit of a misnomer.

FrankD412 commented 5 years ago

Discussion has moved to PR #151 -- just as an FYI to those who might be looking for more information.

FrankD412 commented 5 years ago

@gonsie -- this is the issue I mentioned with specification improvements. Feel free to post thoughts on data dependency here.