ExpansionHunter Denovo

ExpansionHunter Denovo (EHdn) is a suite of tools for detecting novel expansions of short tandem repeats (STRs). EHdn is intended for analysis of a collection of BAM/CRAM files containing alignments of short (100-200bp) reads.

[Figure: EHdn analysis workflow]

As shown in the figure above, the analysis workflow consists of two steps. In the first step, genome-wide STR profiles are extracted from the input BAM/CRAM files. Each STR profile contains information about reads that originate in STRs longer than the read length. In the second step, the STR profiles are compared to each other. The type of comparison depends on the dataset:

| Analysis type | Dataset |
|---------------|---------|
| Case-control  | Cases are enriched in expansions of the same STR |
| Outlier       | Only a few cases are expected to contain the same STR expansion |

For example, if a case-control analysis is applied to a dataset consisting of ALS patients and healthy controls, it is expected to flag the GGCCCC repeat in the C9orf72 gene as highly significant. On the other hand, if the cases are samples from patients with diverse phenotypes, it may be reasonable to assume that there is no enrichment for any specific expansion, in which case a case-control analysis is not appropriate. In this situation, an outlier analysis can be used to flag repeats that are expanded in a small proportion of cases compared to the rest of the dataset.
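
The difference between the two comparison types can be illustrated with a small Python sketch. This is not EHdn's implementation: the function names, the permutation test, the plain z-score, and the per-sample counts are all assumptions chosen for illustration, and the statistics EHdn actually uses are described in its documentation. The sketch assumes that each sample has already been reduced to a single normalized read count for one repeat.

```python
import random
import statistics


def case_control_pvalue(case_counts, control_counts, n_permutations=10000, seed=0):
    """One-sided permutation test (illustrative stand-in, not EHdn's test).

    Estimates how often a random relabeling of the samples yields a
    case-minus-control difference in means at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = statistics.mean(case_counts) - statistics.mean(control_counts)
    pooled = list(case_counts) + list(control_counts)
    n_cases = len(case_counts)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_cases]) - statistics.mean(pooled[n_cases:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)


def outlier_zscores(counts_by_sample):
    """Z-score of each sample's count relative to the whole dataset
    (illustrative stand-in, not EHdn's outlier statistic)."""
    values = list(counts_by_sample.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return {sample: 0.0 for sample in counts_by_sample}
    return {sample: (count - mean) / stdev for sample, count in counts_by_sample.items()}


# Hypothetical counts for one repeat; the numbers are made up.

# Case-control setting: most cases carry the expansion, controls do not.
cases = [30, 28, 35, 31, 29]
controls = [1, 0, 2, 1, 0, 1]
p_value = case_control_pvalue(cases, controls)
print("case-control p-value:", p_value)

# Outlier setting: a single sample out of ten carries the expansion.
cohort = {f"sample{i}": c for i, c in enumerate([1, 0, 2, 1, 1, 0, 2, 1, 1, 40], start=1)}
z_scores = outlier_zscores(cohort)
top_sample = max(z_scores, key=z_scores.get)
print("most extreme sample:", top_sample)
```

In the first dataset the shared expansion separates cases from controls, so a group-level test detects it. In the second, a single expanded sample would barely shift a group mean, but it stands out clearly when each sample is scored against the rest of the cohort.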

Features

Limitations

Documentation

See documentation for installation instructions, usage guide, and description of file formats.

License

ExpansionHunter Denovo is provided under the terms and conditions of the PolyForm Strict License 1.0.0. It relies on several third-party packages provided under other open-source licenses; please see COPYRIGHT.txt for additional details.