CSB5 / GERMS_16S_pipeline

Pipeline for Illumina shotgun sequencing of 16S rRNA amplicon sequences

This is a remake of the original 16S pipeline developed within the Genome Institute of Singapore, designed for Illumina shotgun sequencing of 16S rRNA amplicon sequences (Ong et al., 2013, PMID 23579286).

Running the pipeline

The pipeline is based on snakemake. The main program (S16.py) writes a config file (conf.json) and a snakemake file (snake.make) to the given output directory. These are then used to run snakemake via qsub, using a wrapper script (snake.sh) that is created alongside them. You will receive an email upon completion, whether the run succeeded or not.

For help, see S16.py --help.

The output directory will contain an empty file called COMPLETE only if the run finished successfully.
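Because the COMPLETE marker is the only success signal described here, a follow-up script can test for it before touching the results. The following is a minimal sketch; the function name run_succeeded is hypothetical and not part of S16.py.

```python
from pathlib import Path


def run_succeeded(outdir):
    """Return True if the COMPLETE marker file exists in outdir."""
    # The marker name comes from the README; the helper is an illustration only.
    return (Path(outdir) / "COMPLETE").is_file()
```

For example, run_succeeded("my_output_dir") would return False while a run is still in progress or after it has failed (the directory name is a placeholder).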

Results (abundance tables and pie charts) can then be found in the results subdirectory (see report.html there).

Steps involved

Tip

If you want to first see what the pipeline would do, call S16.py with --no-run. Then check the created files (see above).
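To check the created files after a --no-run call, something like the sketch below could be used. It assumes the three file names mentioned in this README (conf.json, snake.make, snake.sh) and only tests whether they exist; the helper name generated_files is hypothetical.

```python
from pathlib import Path

# File names as listed in this README; the helper itself is hypothetical.
EXPECTED_FILES = ("conf.json", "snake.make", "snake.sh")


def generated_files(outdir):
    """Map each expected file name to whether it exists in outdir."""
    out = Path(outdir)
    return {name: (out / name).is_file() for name in EXPECTED_FILES}
```

Any entry that maps to False points to a file that S16.py did not write to that directory.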

To print all commands that would be run, use:

snakemake -s snake.make --configfile conf.json -n -p

To get a graphical representation of the workflow, run (from the output directory):

snakemake -s snake.make --configfile conf.json --dag --forceall | dot -Tpdf > dag.pdf

and have a look at dag.pdf. Once you're satisfied, just run bash snake.sh.
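If you prefer to drive the dry run from Python, the command from this tip can be wrapped as shown below. This is a sketch only: the function names are hypothetical, the flags are exactly those shown above, and it assumes snakemake is on your PATH.

```python
import shlex
import subprocess


def dry_run_command(snakefile="snake.make", configfile="conf.json"):
    """Build the dry-run command from this tip as an argument list."""
    return ["snakemake", "-s", snakefile, "--configfile", configfile, "-n", "-p"]


def dry_run(outdir):
    """Run the dry run inside outdir; raises CalledProcessError on failure."""
    return subprocess.run(dry_run_command(), cwd=outdir, check=True)


# Show the command line without executing anything.
print(shlex.join(dry_run_command()))
```

Passing the command as a list avoids shell quoting problems if the output directory or file names contain spaces.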

Setup

The pipeline is tuned towards an in-house setup. If you want to replicate it, see the CONF variable in S16.py, which lists the expected binaries and databases.
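Before adapting the setup, it may help to check which tools are already on your PATH. The sketch below does not read CONF, whose layout is not described here. It only checks names you pass in, and the helper name missing_binaries is hypothetical.

```python
import shutil


def missing_binaries(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Tools mentioned in this README; extend with the entries from CONF as needed.
print(missing_binaries(["snakemake", "dot", "qsub"]))
```

An empty list means all listed tools were found. Databases are not covered by this check.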