You can find these options on https://derijkp.github.io/genomecomb/joboptions.html, or using

```
cg help joboptions
```

For example, the queue can be specified using `-dqueue`.
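A full command line might look like the following. This is a sketch only: the project path and queue name are placeholders, and the `-d sge` distribution value is my reading of the joboptions docs; check `cg help joboptions` for the options supported by your version.

```
# Sketch: distribute jobs over a Grid Engine cluster, submitting them
# to a specific queue. The path and queue name are placeholders.
cg process_project -d sge -dqueue all.q /data/myproject
```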
The job system is internal (already in use for a long time, but never published separately). In functionality it is probably closest to snakemake (it was also partly inspired by make, but with a different philosophy): you can restart/resume after an interruption by rerunning the same command line, but you can also restart after e.g. an update/fix of input files, and only the dependent/affected results will be rerun. It is also possible to add new options (e.g. an extra variant caller) and only those will be run. Of course, like all the others, you can run in parallel locally or on a cluster.

The main difference is that in typical job systems you define rules/processes containing code, which are executed/strung together based on e.g. the requested results (snakemake). In the genomecomb system, job commands are embedded in procedural code; for these embedded blocks you specify the dependencies and targets of that piece of code. This makes it very flexible and easy to test/debug: you can also run the code procedurally/non-parallel, even step by step on the REPL (you can even run parallel jobs on the REPL if you want to).
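To make the contrast with rule-based systems concrete, here is a minimal sketch of the embedded-job pattern described above. It is in Python purely for illustration: genomecomb's actual job blocks live in Tcl code and additionally handle cluster submission, logging, etc., and every name here (the `job` helper, the file names) is invented for the example.

```python
from pathlib import Path

def mtime(path):
    """Modification time, or 0.0 for files that do not exist yet."""
    p = Path(path)
    return p.stat().st_mtime if p.exists() else 0.0

def job(name, deps, targets, code):
    """Toy embedded job block: run `code` only when a target is
    missing or older than the newest dependency."""
    newest_dep = max((mtime(d) for d in deps), default=0.0)
    if all(Path(t).exists() and mtime(t) >= newest_dep for t in targets):
        print(f"skip {name}: targets up to date")
        return
    print(f"run {name}")
    code()

# Ordinary procedural pipeline code; each step is wrapped in a job
# block, so rerunning the script only redoes work whose inputs changed.
def align(sample):
    job(f"align-{sample}", deps=[f"{sample}.fastq"],
        targets=[f"{sample}.bam"],
        code=lambda: Path(f"{sample}.bam").write_text("aligned\n"))

def call_variants(sample):
    job(f"varcall-{sample}", deps=[f"{sample}.bam"],
        targets=[f"{sample}.vcf"],
        code=lambda: Path(f"{sample}.vcf").write_text("variants\n"))

for sample in ["s1", "s2"]:
    align(sample)
    call_variants(sample)
```

Because the dependency/target checks sit inside ordinary procedural code, the same script can be run non-parallel and stepped through interactively, which is the REPL-friendly property mentioned above.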
Thanks @derijkp for the detailed explanation!
From the howto:
I can't find any docs on how to provide cluster-specific parameters (e.g., specifying a particular job queue). Is that possible?
More generally, what job submission system (software) are you using? I'm used to snakemake, nextflow, clustermq, and ray, so I'm trying to understand how your job submission system works relative to those tools.