For rare disease, the best practices and the expected number of candidate variants for each inheritance mode are known. The actual filtering is easily done with a tool like slivar. This is a necessary first step with the following limitations:
Note, it is early days for the project. It will produce high-quality SNP/indel candidates but you may need experience with nextflow to run it easily.
This project currently has a workflow that can be run as:
# The per-option notes are kept above the command so that it can be copied and run as-is.
#   -config      a starting config is included in this repo. adjust from there.
#   --xams       NOTE that this is a string glob, so keep it quoted.
#   --ped        see: https://gatk.broadinstitute.org/hc/en-us/articles/360035531972-PED-Pedigree-format
#   --gff        e.g. from: ftp://ftp.ensembl.org/pub/current_gff3/homo_sapiens/
#   --slivarzip  from: https://github.com/brentp/slivar#gnotation-files
nextflow run -resume -profile slurm rare-disease.nf \
    -config nextflow.config \
    --xams "/path/to/*/*.cram" \
    --ped $pedigree_file \
    --fasta $reference_fasta \
    --gff $gff \
    --slivarzip gnomad.hg38.zip \
    --cohort_name my_rare_disease
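Before submitting the run, it can help to confirm that the inputs named in the command exist and that the alignment glob matches at least one file. The following is a minimal sketch of such a check; `check_inputs` is a hypothetical helper that is not part of this repository, and it assumes paths contain no whitespace.

```shell
#!/bin/sh
# check_inputs: hypothetical pre-flight helper (not part of the pipeline).
# Usage: check_inputs "<quoted xams glob>" <file> [<file> ...]
# Assumes that paths do not contain whitespace.
check_inputs() {
    xams_glob=$1
    shift
    status=0

    # Every remaining argument must be an existing, non-empty file.
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done

    # Expand the glob (left unquoted on purpose) and count the matches.
    n=0
    for x in $xams_glob; do
        if [ -e "$x" ]; then
            n=$((n + 1))
        fi
    done

    if [ "$n" -eq 0 ]; then
        echo "no alignment files match: $xams_glob" >&2
        status=1
    else
        echo "found $n alignment file(s)"
    fi

    return "$status"
}
```

It could be called with the same values passed to nextflow, for example `check_inputs "/path/to/*/*.cram" "$pedigree_file" "$reference_fasta" "$gff" gnomad.hg38.zip`. A non-zero return status means that at least one input needs attention.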
See this wiki page for more information about how to use the output.
This does:
The key output will be in: results-rare-disease/${cohort_name}.slivar.candidates.tsv
which can be viewed in Excel or other spreadsheet software.
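For a quick look without a spreadsheet, the candidates file can also be summarized on the command line. The sketch below counts rows per value of a named column; `count_by_column` is a hypothetical helper, and it assumes only that the file is tab-separated with a single header row.

```shell
#!/bin/sh
# count_by_column: hypothetical helper (not part of slivar or this pipeline).
# Usage: count_by_column <tsv file> <column name>
# Prints "count<TAB>value" lines, with the most frequent value first.
count_by_column() {
    awk -F '\t' -v col="$2" '
        NR == 1 {
            # Locate the requested column in the header row.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { counts[$idx]++ }
        END {
            if (!idx) exit 1
            for (k in counts) print counts[k] "\t" k
        }
    ' "$1" | sort -k1,1nr -k2
}
```

For example, `count_by_column results-rare-disease/my_rare_disease.slivar.candidates.tsv '#mode'` would tally candidates per inheritance mode, assuming the header names that column `#mode`; check the first line of your file for the actual column names.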
In addition, it will create: results-rare-disease/${cohort_name}.jigv.html
and results-rare-disease/jigv_plots/*
which together provide an HTML table and interactive igv.js views of each variant and associated alignments that do not rely on the original alignment files.
In coming releases, this will:
Currently, octopus is included as a separate workflow. This octopus.nf pipeline will detect trios and families, run them together, and then iteratively merge across families using the n+1 schema described in the octopus docs.
Finally, the workflow will do the forest filtering as recommended by the
octopus documentation.
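To preview which samples form complete trios in a pedigree file before running the family-aware workflow, something like the following could be used. It is only an illustration based on the standard six-column PED layout (family, sample, father, mother, sex, phenotype, with `0` for a missing parent) and is not the detection logic used by octopus.nf; `list_trios` is a hypothetical helper.

```shell
#!/bin/sh
# list_trios: hypothetical helper (not the logic used by octopus.nf).
# Usage: list_trios <ped file>
# Prints "family<TAB>child<TAB>father<TAB>mother" for every sample whose
# two parents are both listed as samples in the same family.
list_trios() {
    awk '
        /^#/ || NF < 6 { next }    # skip comment/header lines and short rows
        {
            seen[$1 SUBSEP $2] = 1  # remember every (family, sample) pair
            row[++n] = $0
        }
        END {
            for (i = 1; i <= n; i++) {
                split(row[i], f)   # default split: runs of blanks or tabs
                if (f[3] != "0" && f[4] != "0" &&
                    ((f[1] SUBSEP f[3]) in seen) &&
                    ((f[1] SUBSEP f[4]) in seen))
                    print f[1] "\t" f[2] "\t" f[3] "\t" f[4]
            }
        }
    ' "$1"
}
```

Running `list_trios "$pedigree_file"` lists one line per child with both parents present. Samples that do not appear as a child in any line would have to be handled some other way.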
We plan to integrate the octopus and deepvariant calls in the future.
Development and research are underway so that it will: