reanahub / reana-server


dag: estimate initial workflow complexity for Yadage workflows #360

Closed tiborsimko closed 3 years ago

tiborsimko commented 3 years ago

This task is the same as #359 but for Yadage instead of Serial workflows.

Here are some Yadage-specific musings.

(1) Note that Yadage can launch many stages in parallel. We should parse the Yadage definitions and look for stages that depend only on "init", such as:

stages:
  - name: gendata
    dependencies: [init]

These will run initially.

We should not consider later workflow stages that depend on previous ones, such as:

  - name: fitdata
    dependencies: [gendata]

as these will come into play only later.
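
A minimal sketch of this first step, assuming the Yadage specification has already been loaded into a Python dictionary; the helper name get_initial_stages is made up here, and the dict form of dependencies handled below is an extra safeguard rather than something shown in the examples above:

import yaml


def get_initial_stages(workflow_spec):
    """Return the stages of a parsed Yadage spec that depend only on 'init'."""
    initial_stages = []
    for stage in workflow_spec.get("stages", []):
        dependencies = stage.get("dependencies", [])
        # Yadage also accepts a dict form such as
        # {"dependency_type": ..., "expressions": [...]}; normalise it.
        if isinstance(dependencies, dict):
            dependencies = dependencies.get("expressions", [])
        if set(dependencies) == {"init"}:
            initial_stages.append(stage)
    return initial_stages


with open("workflow.yaml") as f:
    spec = yaml.safe_load(f)
print([stage["name"] for stage in get_initial_stages(spec)])

For the example above this would select gendata but not fitdata.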

(2) Note that we should also take all subworkflows into account; see the BSM example, where workflow/databkgmc.yml can launch many stages in parallel.
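
A possible way to cover subworkflows, assuming their definitions have been resolved and embedded into the parent structure; the workflow key under scheduler is used here to spot an embedded subworkflow, but treat the exact layout as an assumption:

def walk_stages(workflow_spec):
    """Yield every stage of a parsed Yadage spec, including subworkflow stages."""
    for stage in workflow_spec.get("stages", []):
        yield stage
        scheduler = stage.get("scheduler", {})
        # A stage may schedule a whole subworkflow instead of a single step;
        # if its definition is embedded, descend into it as well.
        subworkflow = scheduler.get("workflow")
        if isinstance(subworkflow, dict):
            yield from walk_stages(subworkflow)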

(3) For each stage that can run initially, we should extract the number of jobs and the memory they require. Note that one stage may launch multiple jobs, for example:

stages:
  - name: skim
    dependencies: [init]
    scheduler:
      scheduler_type: multistep-stage
      parameters:
        input_file: {step: init, output: files}
        cross_section: {step: init, output: cross_sections}
        output_file: '{workdir}/skimmed.root'
      scatter:
        method: zip
        parameters: [input_file, cross_section]
      step: {$ref: 'steps.yaml#/skim'}

Here, the skimming stage will lead to running, say, 9 jobs, if the input looks like:

inputs:
  parameters:
    files:
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/GluGluToHToTauTau.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/VBF_HToTauTau.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/DYJetsToLL.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/TTbar.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/W1JetsToLNu.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/W2JetsToLNu.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/W3JetsToLNu.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/Run2012B_TauPlusX.root
      - root://eospublic.cern.ch//eos/root-eos/HiggsTauTauReduced/Run2012C_TauPlusX.root
    cross_sections:
      - 19.6
      - 1.55
      - 3503.7
      - 225.2
      - 6381.2
      - 2039.8
      - 612.5
      - 1.0
      - 1.0
    short_hands:
      - [ggH]
      - [qqH]
      - [ZLL,ZTT]
      - [TT]
      - [W1J]
      - [W2J]
      - [W3J]
      - [dataRunB]
      - [dataRunC]

The task should therefore statically analyse the workflow, extract the stages that depend only on init, detect the "scatter" paradigm, and count the number of input array items. In this case the result would be (9, 8 GiB), since skimming will launch 9 parallel jobs initially.
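
Putting the pieces together, a hedged end-to-end sketch could look as follows; it reuses get_initial_stages from the sketch above, and the 8 GiB per-job memory is treated as an assumed default rather than something read from the workflow:

DEFAULT_JOB_MEMORY_GIB = 8  # assumed per-job memory default


def estimate_initial_complexity(workflow_spec, input_parameters):
    """Estimate (number of jobs, memory in GiB per job) launched initially.

    ``workflow_spec`` is the parsed Yadage workflow and ``input_parameters``
    the parsed ``inputs.parameters`` dictionary of the run.
    """
    total_jobs = 0
    for stage in get_initial_stages(workflow_spec):
        scheduler = stage.get("scheduler", {})
        scatter = scheduler.get("scatter")
        if scatter and scatter.get("method") == "zip":
            # With zip scattering the number of jobs equals the length of
            # the zipped arrays, so count the items of the first scattered
            # parameter (here: input_file -> init output "files").
            scattered_param = scatter["parameters"][0]
            parameter_ref = scheduler["parameters"][scattered_param]
            values = input_parameters.get(parameter_ref.get("output"), [])
            total_jobs += len(values)
        else:
            # A plain single-step stage contributes one job.
            total_jobs += 1
    return total_jobs, DEFAULT_JOB_MEMORY_GIB

For the skimming example above this yields (9, 8), i.e. nine parallel jobs at 8 GiB each, matching the expected initial complexity.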

audrium commented 3 years ago

Closing since it was implemented in https://github.com/reanahub/reana-server/pull/366