lcdb / lcdb-workflows

DEPRECATED. Please see https://github.com/lcdb/lcdb-wf
MIT License

Cluster config #38

Closed daler closed 8 years ago

daler commented 8 years ago

This PR adds a mechanism for supplying each rule's cluster config settings alongside all of that rule's other settings, instead of maintaining a separate cluster config YAML file. To submit the workflow as a set of cluster jobs, use the new lcdb/lcdb-submit.py wrapper. It reads the config, builds a cluster config on the fly, and provides that cluster config inside a newly created batch script that can be submitted to the cluster.
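To make the idea concrete, here is a minimal sketch of collecting per-rule cluster settings into a cluster config. It is not the actual lcdb-submit.py code. The config layout (a `rules` mapping whose entries may carry a `cluster` block), the function name, and the example values are all assumptions for illustration; only the `args` key is taken from the `{cluster.args}` placeholder used in the generated script in this PR. `__default__` is the fallback entry that snakemake looks up in a cluster config.

```python
import json


def build_cluster_config(config, default_args=""):
    """Collect per-rule cluster settings into a snakemake-style cluster config.

    `config` is a HYPOTHETICAL workflow config of the form
    {"rules": {rule_name: {..., "cluster": {"args": "..."}}}}.
    """
    # Rules without their own entry fall back to __default__.
    cluster_config = {"__default__": {"args": default_args}}
    for rule_name, settings in config.get("rules", {}).items():
        cluster = settings.get("cluster")
        if cluster:
            cluster_config[rule_name] = dict(cluster)
    return cluster_config


# Hypothetical example config; rule names and sbatch options are made up.
example_config = {
    "rules": {
        "align": {"threads": 8, "cluster": {"args": "--mem=32g --time=8:00:00"}},
        "fastqc": {"threads": 1},
    }
}

cluster_config = build_cluster_config(example_config, default_args="--mem=4g")

# JSON keeps the sketch dependency-free; --cluster-config accepts JSON or YAML.
cluster_config_text = json.dumps(cluster_config, indent=2, sort_keys=True)
print(cluster_config_text)
```

Keeping the cluster block next to the rule's other settings means a rule's threads and its scheduler options are edited in one place, which is the motivation for dropping the separate file.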

I also removed the test workflow in this PR. The reason is that it was getting out of date with the changes in config and snakefile architecture. Most of the rules are already moved over to the mapping config anyway, and the rest will be incorporated into the rnaseq workflow.

I've updated the main README to reflect these changes and to show the commands for testing. The big change is that the first argument is no longer the top-level repo; instead, provide the path to the snakefile that you want to test.

You can also provide --config to specify a particular config file to test. This will become useful as we incorporate more test datasets and as we build more comprehensive tests.
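For reference, the command-line shape described here could be sketched with argparse as follows. This is an illustration under assumptions, not the real test/run_test.py: the flag names are the ones used in this PR (`--config`, `--cluster`, `--build-env`, `--clean`), but the help strings, the defaults, and the `make_parser` name are guesses.

```python
import argparse


def make_parser():
    """Build a parser mirroring the test-runner invocation used in this PR."""
    parser = argparse.ArgumentParser(description="Sketch of a test-runner CLI")
    # First positional argument: the snakefile to test (not the top-level repo).
    parser.add_argument("snakefile", help="path to the snakefile to test")
    parser.add_argument("--config", default=None,
                        help="config file to test with (assumed optional)")
    # The remaining flags appear as bare switches in the PR's example command.
    parser.add_argument("--cluster", action="store_true",
                        help="build a cluster config and submit via the cluster")
    parser.add_argument("--build-env", action="store_true",
                        help="build the environment first (assumed meaning)")
    parser.add_argument("--clean", action="store_true",
                        help="run the clean target first (assumed meaning)")
    return parser


args = make_parser().parse_args(
    ["workflows/mapping/Snakefile", "--cluster", "--build-env", "--clean"]
)
print(args.snakefile, args.config, args.cluster, args.build_env, args.clean)
```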

The new --cluster arg handles building the cluster config YAML. The full test can be run like this:

test/run_test.py workflows/mapping/Snakefile --cluster --build-env --clean

It will create a temp bash script and run it. That bash script looks something like this:

#!/bin/bash

set -eo pipefail

source activate lcdb-workflows-$USER-env
snakemake --directory workflows/mapping -s workflows/mapping/Snakefile clean
time snakemake --cluster-config test-cluster-config.yaml \
    --cluster "sbatch {cluster.args} --cpus-per-task={threads}" \
    --jobname "s.{rulename}.{jobid}.sh" \
    -j 999 \
    --rerun-incomplete \
    -T \
    --verbose \
    --directory workflows/mapping \
    -s workflows/mapping/Snakefile \
    > workflows/mapping/Snakefile.log 2>&1

(so check workflows/mapping/Snakefile.log for the log output)
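As a rough sketch of how a script like the one above could be rendered from the snakefile path, something along these lines would work. It is not the code in this PR: the function name, its parameters, and the choice to derive the working directory and log path from the snakefile path are assumptions based only on the example script. The snakemake flags are copied verbatim from that script.

```python
import os
import shlex


def render_submit_script(snakefile, cluster_config,
                         env_name="lcdb-workflows-$USER-env"):
    """Return the text of a bash script that runs `snakefile` on the cluster."""
    # Assumption: the workflow runs in the snakefile's own directory.
    workdir = shlex.quote(os.path.dirname(snakefile) or ".")
    target = shlex.quote(snakefile)
    # Assumption: the log is written next to the snakefile as <snakefile>.log.
    log = shlex.quote(snakefile + ".log")
    snakemake_args = [
        "--cluster-config " + shlex.quote(cluster_config),
        # Plain strings: snakemake, not Python, fills in the {...} placeholders.
        '--cluster "sbatch {cluster.args} --cpus-per-task={threads}"',
        '--jobname "s.{rulename}.{jobid}.sh"',
        "-j 999",
        "--rerun-incomplete",
        "-T",
        "--verbose",
        "--directory " + workdir,
        "-s " + target,
    ]
    lines = [
        "#!/bin/bash",
        "",
        "set -eo pipefail",
        "",
        # env_name is left unquoted so the shell can still expand $USER.
        "source activate " + env_name,
        "snakemake --directory " + workdir + " -s " + target + " clean",
        "time snakemake " + " \\\n    ".join(snakemake_args)
        + " \\\n    > " + log + " 2>&1",
    ]
    return "\n".join(lines) + "\n"


script = render_submit_script("workflows/mapping/Snakefile",
                              "test-cluster-config.yaml")
print(script)
```

Building the argument list first and joining it afterwards avoids emitting an empty continuation line when an optional argument is absent.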