This is a remake of the original 16S pipeline developed within the Genome Institute of Singapore, designed for Illumina shotgun sequencing of 16S rRNA amplicon sequences (Ong et al., 2013, PMID 23579286).
The pipeline is based on snakemake. The main program (S16.py) writes a
config file (conf.json) and a snakemake file (snake.make) to the given
output directory. These are then used to call snakemake via qsub,
using a wrapper script (snake.sh) that is created alongside them. The
system will send you an email upon completion, whether successful or
not.
For help see S16.py --help.
Only upon successful completion will the output directory contain an
empty file called COMPLETE.
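Because COMPLETE only appears after a successful run, a script can use it as a simple success check. The following is a minimal sketch, assuming the path to the output directory is known; the function name run_completed is hypothetical and not part of S16.py.

```python
from pathlib import Path


def run_completed(outdir):
    """Return True if the COMPLETE marker file exists in outdir."""
    # COMPLETE is the empty marker file that the pipeline writes only
    # after a successful run, so its absence means "failed or still running".
    return (Path(outdir) / "COMPLETE").is_file()
```

A job that consumes the results could call run_completed("/path/to/outdir") before reading anything from the results subdirectory.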
Results (abundance tables and pie charts) can then be found in the
results subdirectory of the output directory (see report.html there,
and ratios.txt for the corresponding ratios). Pairwise identity
thresholds for the different taxonomic ranks are implemented as
determined by Yarza et al. (2014; PMID 25118885).
If you want to first see what the pipeline would do, call S16.py with
--no-run.
Then check the created files (see above).
To print all commands that would be run, use:
snakemake -s snake.make --configfile conf.json -n -p
To get a graphical representation of the workflow, run (from the output directory):
snakemake -s snake.make --configfile conf.json --dag --forceall | dot -Tpdf > dag.pdf
and have a look at dag.pdf. Once you're satisfied, just run bash snake.sh.
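The dry-run preview described above can also be scripted, for example to check several output directories in one go. This is only a sketch: the helper names preview_command and run_preview are hypothetical, and it assumes snakemake is on the PATH and that snake.make and conf.json already exist in the output directory.

```python
import subprocess


def preview_command(snakefile="snake.make", configfile="conf.json"):
    """Build the dry-run command (-n) that also prints shell commands (-p)."""
    return ["snakemake", "-s", snakefile, "--configfile", configfile, "-n", "-p"]


def run_preview(outdir):
    """Run the dry run from within outdir; raises CalledProcessError on failure."""
    return subprocess.run(preview_command(), cwd=outdir, check=True)
```

Building the command as a list avoids shell quoting issues, and running with cwd=outdir mirrors calling snakemake from the output directory.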
This is tuned towards an in-house setup. If you want to replicate it,
see the CONF variable in S16.py, which lists the expected binaries and
databases.
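When porting the pipeline to another system, it can help to check up front which of the expected binaries and databases are missing. The sketch below assumes a CONF-like dictionary with "binaries" and "databases" keys; that layout, the EXAMPLE_CONF values, and the function name missing_requirements are all hypothetical, so adapt them to the actual CONF variable in S16.py.

```python
import shutil
from pathlib import Path

# Hypothetical stand-in for the CONF variable in S16.py; the real keys
# and values are defined there and will differ.
EXAMPLE_CONF = {
    "binaries": ["snakemake", "qsub"],
    "databases": ["/path/to/16S_database.fa"],  # placeholder path
}


def missing_requirements(conf):
    """Return (missing_binaries, missing_databases) for a CONF-like dict."""
    # A binary counts as missing if it cannot be found on the PATH.
    missing_bins = [b for b in conf.get("binaries", []) if shutil.which(b) is None]
    # A database counts as missing if its path does not exist.
    missing_dbs = [d for d in conf.get("databases", []) if not Path(d).exists()]
    return missing_bins, missing_dbs


if __name__ == "__main__":
    bins, dbs = missing_requirements(EXAMPLE_CONF)
    print("Missing binaries:", bins)
    print("Missing databases:", dbs)
```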