reanahub / reana-demo-cms-reco

REANA example - CMS reconstruction
MIT License

workflow factory: initial implementation #11

Closed tiborsimko closed 5 years ago

tiborsimko commented 5 years ago

Following successful tests in #8 #9 #10, we know that REANA is able to run CMS reconstruction for a variety of RAW samples (e.g. dataset SingleMu) and data-taking years (e.g. 2011).

(1) Design a first simple "workflow factory" script that will produce a REANA workflow for the given parameters. Example:

$ cms-reco --create-workflow --dataset SingleElectron --year 2011

The command should generate a workflow in a given output directory that is ready to run on REANA, with any necessary input file information, configuration files, Python code snippets and whatnot.

For example, people could then do:

$ cms-reco --create-workflow --dataset SingleElectron --year 2011
Created `cms-reco-SingleElectron-2011` directory.
$ cd cms-reco-SingleElectron-2011
$ reana-client run 
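The command-line behaviour above could be sketched roughly as follows. This is only a minimal illustration using Python's standard `argparse`, not a settled design: the function names (`build_parser`, `create_workflow_dir`, `main`) and the `--output-dir` option are assumptions made for the example, and the sketch only creates the directory without populating it.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build the CLI parser; option names follow the example in this ticket."""
    parser = argparse.ArgumentParser(prog="cms-reco")
    parser.add_argument("--create-workflow", action="store_true",
                        help="generate a REANA workflow directory")
    parser.add_argument("--dataset", default="SingleElectron",
                        help="CMS primary dataset name")
    parser.add_argument("--year", type=int, default=2011,
                        help="data-taking year")
    # Hypothetical option, not part of the proposal above: where to
    # create the generated workflow directory.
    parser.add_argument("--output-dir", type=Path, default=Path("."),
                        help="parent directory for the generated workflow")
    return parser


def create_workflow_dir(dataset, year, base=Path(".")):
    """Create and return the `cms-reco-<dataset>-<year>` directory."""
    target = Path(base) / f"cms-reco-{dataset}-{year}"
    target.mkdir(parents=True, exist_ok=True)
    return target


def main(argv=None):
    """Parse arguments and create the workflow directory if requested."""
    args = build_parser().parse_args(argv)
    if not args.create_workflow:
        return None
    target = create_workflow_dir(args.dataset, args.year, args.output_dir)
    print(f"Created `{target.name}` directory.")
    return target
```

Calling `main(["--create-workflow", "--dataset", "SingleElectron", "--year", "2011"])` would create the directory in the current working directory. Writing `reana.yaml` and the CMSSW configuration files into it is left out of this sketch.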

(2) In the future, the necessary CMSSW release version and the configuration files will be read entirely from CERN Open Data records using cernopendata-client. Until the client is fully ready, the first implementation could have the snippets committed here and/or read from CMS's RAWtoAODValidation repository.

(3) The implementation should be extensible so that we could easily add additional arguments in the future, for example:

Note that (2) and (3) aren't to be implemented as part of this ticket; it is sufficient to think about them in order to choose the underlying technology (e.g. Jinja templating, cookiecutter templating, or simply generating everything from Python via string templates).
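To make the last option concrete, here is a minimal sketch of the "plain Python string templates" approach using the standard library's `string.Template`. The `reana.yaml` skeleton, the container image name and the `echo` command are placeholders invented for illustration, not the actual reconstruction configuration; `render_reana_yaml` is likewise a hypothetical helper name.

```python
from string import Template

# Skeleton of a serial REANA workflow. The step command is a
# placeholder; a real workflow would run the CMSSW reconstruction.
REANA_YAML = Template("""\
inputs:
  parameters:
    dataset: $dataset
    year: $year
workflow:
  type: serial
  specification:
    steps:
      - environment: '$image'
        commands:
          - echo "reconstruct $dataset $year"
""")


def render_reana_yaml(dataset, year, image="example/cmssw:placeholder", **extra):
    """Fill the template; `image` is a made-up placeholder value.

    Extra keyword arguments are accepted so that future options can be
    passed through; `substitute` ignores keys the template does not use
    and raises KeyError if a placeholder has no value.
    """
    params = {"dataset": dataset, "year": year, "image": image, **extra}
    return REANA_YAML.substitute(params)


if __name__ == "__main__":
    print(render_reana_yaml("SingleElectron", 2011))
```

One caveat with this approach is that any literal `$` in the generated shell commands has to be written as `$$` in the template. Jinja or cookiecutter avoid that clash at the cost of an extra dependency.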

See also musings in https://github.com/reanahub/reana/issues/189

lukasheinrich commented 5 years ago

Just to connect threads: there is also make_workflow.py, which we use in the recast-workflow repo https://github.com/recast-hep/recast-workflow/tree/master and which @AlexSchuy is working on, to generate workflows from a bunch of combinatorial options.