andrej-fischer / cloneHD

High-definition reconstruction of clonal composition from next-generation sequencing data
GNU General Public License v3.0

How to get cloneHD and filterHD?

The current stable release, as well as pre-compiled executable binaries for Mac OS X and GNU/Linux (64-bit), can be found here. The cloneHD software is undergoing rapid development. Watch/Star this repo to receive updates.

Run a test with simulated data

After downloading cloneHD from the release site, you can test both filterHD and cloneHD by running

$ sh run-example.sh

where you can see a typical workflow of analysing read depth and BAF data with a matched normal. All command line arguments are explained below.

Compilation

For Mac OS X and GNU/Linux (64-bit), pre-compiled binaries are available here. To compile cloneHD yourself, you need the GNU Scientific Library (GSL) v1.15 or later. If GSL is installed in a non-standard location, change the paths in the Makefile to point to your local installation. Then type

$ make

in the src directory. The executables will be in build. For debugging with gdb, use make -f Makefile.debug.

Report bugs

To report bugs, use the issue interface of GitHub.

Full documentation

The full documentation can be found in the /docs/ subfolder.

What are cloneHD and filterHD for?

cloneHD is software for reconstructing the subclonal structure of a population from short-read sequencing data. Read depth data, B-allele count data and somatic nucleotide variant (SNV) data can be used for the inference. cloneHD can estimate the number of subclonal populations, their fractions in the sample, their individual total copy number profiles, their B-allele status and all the SNV genotypes at high resolution.
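To make the idea of subclone fractions and per-subclone copy number profiles concrete, here is a minimal Python sketch of how a mixed sample's mean total copy number at one locus could be computed. It is an illustration only, not cloneHD's actual model or code: the function name `mean_copy_number`, the assumption that the remaining cells are normal with copy number 2, and the simple linear mixing are all assumptions made for this example.

```python
def mean_copy_number(fractions, copy_numbers, normal_copy=2):
    """Mean total copy number of a mixed sample at a single locus.

    fractions    -- cell fraction of each subclone (must sum to at most 1);
                    the remainder is assumed to be normal cells.
    copy_numbers -- total copy number of each subclone at this locus.
    normal_copy  -- copy number assumed for normal cells (hypothetical default).
    """
    if len(fractions) != len(copy_numbers):
        raise ValueError("need one copy number per subclone")
    total = sum(fractions)
    if any(f < 0 for f in fractions) or total > 1 + 1e-12:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    normal_fraction = 1.0 - total
    # Linear mixture: each population contributes in proportion to its fraction.
    mixed = sum(f * c for f, c in zip(fractions, copy_numbers))
    return normal_fraction * normal_copy + mixed


# Two subclones at 50% and 25% of cells with copy numbers 3 and 1;
# the remaining 25% of cells are treated as normal (copy number 2).
print(mean_copy_number([0.5, 0.25], [3, 1]))
```

Under these assumptions, the expected read depth at a locus would scale with this mean copy number, which is why read depth carries information about both the fractions and the copy number profiles.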

filterHD is a general purpose probabilistic filtering algorithm for one-dimensional discrete data, similar in spirit to a Kalman filter. It is a continuous state space Hidden Markov model with Poisson or Binomial emissions and a jump-diffusion propagator. It can be used for scale-free smoothing, fuzzy data segmentation and data filtering.
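The following Python sketch illustrates the general idea of such a filter: a forward pass of a hidden Markov model whose continuous state (an emission rate) is discretised on a grid, with Poisson emissions and a transition step that mixes Gaussian diffusion with a small jump probability. It is not filterHD's implementation. The function names, the grid discretisation, the flat prior and the choice of a uniform jump distribution are all assumptions made for illustration, and only the forward (filtering) pass is shown.

```python
import math


def poisson_logpmf(n, rate):
    """Log-probability of observing count n under a Poisson with the given rate."""
    return n * math.log(rate) - rate - math.lgamma(n + 1)


def forward_filter(counts, grid, sigma, jump):
    """Posterior mean of the hidden rate at each position (forward pass only).

    counts -- observed non-negative integer counts
    grid   -- strictly positive candidate rates (the discretised state space)
    sigma  -- standard deviation of the diffusion step (hypothetical parameter)
    jump   -- per-step probability of jumping to a uniformly random grid state
    """
    k = len(grid)
    # Diffusion kernel: row i holds the transition probabilities out of grid[i],
    # a Gaussian centred on grid[i] and normalised over the grid.
    kernel = []
    for xi in grid:
        row = [math.exp(-0.5 * ((xj - xi) / sigma) ** 2) for xj in grid]
        s = sum(row)
        kernel.append([w / s for w in row])

    belief = [1.0 / k] * k  # flat prior over the grid (an assumption)
    means = []
    for t, n in enumerate(counts):
        if t > 0:
            # Predict: diffuse the belief, then mix in the uniform jump component.
            diffused = [sum(belief[i] * kernel[i][j] for i in range(k))
                        for j in range(k)]
            belief = [(1.0 - jump) * d + jump / k for d in diffused]
        # Update: weight by the Poisson likelihood, rescaled to avoid underflow.
        logw = [poisson_logpmf(n, x) for x in grid]
        m = max(logw)
        post = [b * math.exp(lw - m) for b, lw in zip(belief, logw)]
        z = sum(post)
        belief = [p / z for p in post]
        means.append(sum(b * x for b, x in zip(belief, grid)))
    return means


# Toy data: the underlying level changes abruptly from 10 to 30.
counts = [10] * 20 + [30] * 20
grid = [0.5 * i for i in range(1, 101)]  # candidate rates 0.5, 1.0, ..., 50.0
means = forward_filter(counts, grid, sigma=0.5, jump=0.01)
print(round(means[15], 1), round(means[-1], 1))
```

In this sketch the diffusion term lets the rate drift smoothly, while the jump term lets the belief relocate quickly after an abrupt change, which is the intuition behind using such a filter for segmentation.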


Visualization of the cloneHD output for the simulated data set. From top to bottom: (i) The bias-corrected read depth data and the cloneHD prediction (red). (ii) The BAF (B-allele frequency), reflected at 0.5, and the cloneHD prediction (red). (iii) The total copy number posterior. (iv) The real total copy number profile. (v) The minor copy number posterior. (vi) The real minor copy number profile. (vii) The observed SNV frequencies, corrected for local ploidy, and per genotype (SNVs are assigned randomly according to the cloneHD SNV posterior). (All plots are created with Wolfram Mathematica.)
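As a small illustration of what "reflected at 0.5" in panel (ii) can mean, the sketch below folds a B-allele frequency onto the interval [0, 0.5], so that a locus is treated the same regardless of which allele happens to be labelled B. The function name `reflected_baf` and the choice to fold onto the lower half (rather than the upper half) are assumptions for this example; the plotting code used for the figure is not shown in this README.

```python
def reflected_baf(b_count, depth):
    """B-allele frequency folded at 0.5 onto the interval [0, 0.5].

    b_count -- number of reads supporting the B allele
    depth   -- total number of reads at the locus (must be positive)
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= b_count <= depth:
        raise ValueError("b_count must lie between 0 and depth")
    baf = b_count / depth
    # A frequency f and its mirror image 1 - f map to the same folded value.
    return min(baf, 1.0 - baf)


# 30 of 40 reads and 10 of 40 reads give the same reflected value.
print(reflected_baf(30, 40), reflected_baf(10, 40))
```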

Tips and tricks

How to cite

The cloneHD and filterHD software is free under the GNU General Public License v3. If you use this software in your work, please cite the accompanying publication:

Andrej Fischer, Ignacio Vazquez-Garcia, Christopher J.R. Illingworth and Ville Mustonen. High-definition reconstruction of subclonal composition in cancer. Cell Reports (2014), http://dx.doi.org/10.1016/j.celrep.2014.04.055