toil pipeline for CNACS
The toil_cnacs CLI is divided into 3 steps: generate_pool, finalise_pool, and run.
generate_pool: create the reference files for a pool of normals.
finalise_pool: confirm the thresholds for a pool of normals.
run: run copy number analysis for tumor samples.
Note that you must use a different jobstore for each sub-command. For details, see:
toil_cnacs --help
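One way to keep the jobstores separate is to derive each path from the step name. The sketch below is only an illustration: the helper `jobstore_for` and the `/tmp/my_pool` directory are hypothetical, and the `jobstore_<step>` naming is an assumption rather than something the pipeline requires.

```shell
# Hypothetical helper: derive a distinct jobstore path for each sub-command.
jobstore_for() {
    # $1 = base directory, $2 = step name (generate_pool, finalise_pool or run)
    printf '%s/jobstore_%s\n' "$1" "$2"
}

pool_dir="/tmp/my_pool"   # assumed location, adjust to your setup

# Print the jobstore each step would use; no toil_cnacs command is executed.
for step in generate_pool finalise_pool run; do
    echo "$step -> $(jobstore_for "$pool_dir" "$step")"
done
```

Each path can then be passed as the first positional argument of the matching sub-command.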
Currently, only targeted panels and hg19 fasta files are supported. Bam files can be GRCh37 or hg19.
Docker and Singularity are supported:
# run with docker
toil_cnacs [STEP] [TOIL-OPTIONS] [PIPELINE-OPTIONS] \
--docker papaemmelab/docker-cnacs \
--volumes <local path> <container path>
# run with singularity
toil_cnacs [STEP] [TOIL-OPTIONS] [PIPELINE-OPTIONS] \
--singularity docker://papaemmelab/docker-cnacs \
--volumes <local path> <container path>
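If the same script has to work on machines with either runtime, the container flag can be chosen at run time. This is a minimal sketch: the function `container_args` is hypothetical, and preferring Docker over Singularity when both are installed is an arbitrary choice.

```shell
# Hypothetical helper: pick the container flag based on what is installed.
container_args() {
    if command -v docker >/dev/null 2>&1; then
        echo "--docker papaemmelab/docker-cnacs"
    elif command -v singularity >/dev/null 2>&1; then
        echo "--singularity docker://papaemmelab/docker-cnacs"
    else
        echo "neither docker nor singularity found on PATH" >&2
        return 1
    fi
}

# Show the flag that would be appended; nothing is launched here.
if flags="$(container_args)"; then
    echo "container flags: $flags"
fi
```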
To install:
git clone git@github.com:papaemmelab/toil_cnacs.git
cd toil_cnacs
pip install .
This sub-command creates a pool of normals for a specific panel. Use 5-10 normal samples of varying gender. Example:
toil_cnacs generate_pool \
{pool_dir}/jobstore_generate_pool \
--stats \
--writeLogs {pool_dir}/toil_logs \
--logFile {pool_dir}/toil_logs.txt \
--outdir {pool_dir} \
--probe_bed {panel bed} \
--fasta {hg19 reference fasta} \
--pool_samp {normal1 bam} {normal1 gender} \
--pool_samp {normal2 bam} {normal2 gender} \
...
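With many normals, the repeated `--pool_samp` flags can be generated from a small sample sheet instead of being typed by hand. The sketch below assumes a hypothetical sheet with one `<bam path> <gender>` pair per line. The function `build_pool_samp_args` is illustrative, and the gender values shown are placeholders, so check `toil_cnacs generate_pool --help` for the values the CLI actually accepts.

```shell
# Hypothetical sample sheet format: one "<bam path> <gender>" pair per line.
build_pool_samp_args() {
    # $1 = sample sheet; prints one "--pool_samp <bam> <gender>" line per sample
    while read -r bam gender; do
        # Skip blank lines and lines starting with '#'.
        case "$bam" in ''|'#'*) continue ;; esac
        printf '%s %s %s\n' '--pool_samp' "$bam" "$gender"
    done < "$1"
}

# Demo with a throwaway sheet; paths and gender values are placeholders.
sheet="$(mktemp)"
printf '%s\n' '/data/normal1.bam male' '/data/normal2.bam female' > "$sheet"
build_pool_samp_args "$sheet"
rm -f "$sheet"
```

The printed lines can be pasted into the generate_pool command in place of the hand-written `--pool_samp` entries.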
Once you have generated your pool, use the PDF images in outdir/stats to set the thresholds in outdir/stats/threshold.txt.
This sub-command finalises the thresholds for your pool of normals. Be sure that you have gone through the images in outdir/stats and set the thresholds in outdir/stats/threshold.txt. Example:
toil_cnacs finalise_pool \
{pool_dir}/jobstore_finalise_pool \
--stats \
--writeLogs {pool_dir}/toil_logs \
--logFile {pool_dir}/toil_logs.txt \
--outdir {pool_dir} \
--fasta {hg19 reference fasta}
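Because finalise_pool depends on the threshold file, a small pre-flight check can catch a missing file before a job is started. This is a sketch only: the helper `check_thresholds` and the default `/tmp/my_pool` path are hypothetical. It verifies that the file exists and is non-empty, and it does not validate the contents, since the file format is not described here.

```shell
# Hypothetical pre-flight check to run before finalise_pool.
check_thresholds() {
    # $1 = pool output directory (the --outdir used for generate_pool)
    threshold_file="$1/stats/threshold.txt"
    if [ ! -s "$threshold_file" ]; then
        echo "missing or empty: $threshold_file" >&2
        return 1
    fi
    echo "found: $threshold_file"
}

pool_dir="${POOL_DIR:-/tmp/my_pool}"   # assumed location, adjust to your setup
if check_thresholds "$pool_dir"; then
    echo "thresholds present, finalise_pool can be started"
fi
```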
After you have generated and finalised your pool of normals for your panel,
you can run the main pipeline on any number of tumors. Make sure to set --pool_dir
to the location of your pool output directory. The --samp flag can be used to specify
tumor bams, and/or --samp_file can be used to pass a file with a list of bams. Example:
toil_cnacs run \
{outdir}/jobstore \
--stats \
--writeLogs {outdir}/toil_logs \
--logFile {outdir}/toil_logs.txt \
--outdir {outdir} \
--pool_dir {pool_dir} \
--fasta {hg19 reference fasta} \
--samp {tumor1 bam}
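For a directory of tumors, the list file for `--samp_file` can be generated rather than written by hand. The sketch below assumes the file holds one bam path per line, which is not confirmed here, so check `toil_cnacs run --help` for the expected layout. The function `write_samp_file` and the demo paths are hypothetical.

```shell
# Assumption: --samp_file expects one bam path per line.
write_samp_file() {
    # $1 = directory containing tumor bams, $2 = output list file
    : > "$2"                        # create or truncate the list file
    for bam in "$1"/*.bam; do
        [ -e "$bam" ] || continue   # an unmatched glob stays literal, skip it
        printf '%s\n' "$bam" >> "$2"
    done
}

# Demo on a throwaway directory with empty placeholder files.
demo_dir="$(mktemp -d)"
touch "$demo_dir/tumor1.bam" "$demo_dir/tumor2.bam" "$demo_dir/tumor1.bam.bai"
write_samp_file "$demo_dir" "$demo_dir/samples.txt"
cat "$demo_dir/samples.txt"
rm -rf "$demo_dir"
```

Index files and other non-bam files are skipped because only names ending in `.bam` match the glob.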
Contributions are welcome and greatly appreciated. Check our contributing guidelines!
CNACS core developers: Yusuke Shiozawa and Ryunosuke Saiki
CNACS has been described in Yoshizato et al., Blood 2017.
This package was created using Cookiecutter and the papaemmelab/cookiecutter-toil project template.