hammerlab / guacamole

Spark-based variant calling, with experimental support for multi-sample somatic calling (including RNA) and local assembly
Apache License 2.0

joint caller: output phasing information #389

Open timodonnell opened 8 years ago

timodonnell commented 8 years ago

The joint caller should optionally output a CSV file that gives, for each pair A, B of variants (both germline and somatic) in each sample:

One possible application for this data is to constrain phylogeny inference: if all the reads supporting variant A also support variant B, then mutation A probably occurred after B.
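A rough sketch of the read-level bookkeeping this would involve, written in Python for brevity rather than as Guacamole code. Everything named here is an assumption for illustration: the `reads` input structure, the function names, and the CSV columns, which the issue does not specify. For each pair of variants it considers only reads that overlap both sites, counts the reads supporting each variant and both together, and flags the case where every read supporting A also supports B.

```python
import csv
from itertools import combinations

# Hypothetical column set; the issue does not define the actual CSV layout.
FIELDS = [
    "sample", "variant_a", "variant_b", "spanning_reads",
    "reads_a", "reads_b", "reads_both",
    "a_nested_in_b", "b_nested_in_a",
]


def pair_support_rows(sample, reads):
    """Summarize read-level co-occurrence for every pair of variants.

    `reads` maps a read name to {variant_id: bool}, where True means the
    read supports the variant allele and False means it supports another
    allele at that site. A variant missing from a read's dict means the
    read does not overlap that site.
    """
    variants = sorted({v for calls in reads.values() for v in calls})
    rows = []
    for a, b in combinations(variants, 2):
        # Only reads overlapping both sites say anything about the pair.
        spanning = [c for c in reads.values() if a in c and b in c]
        if not spanning:
            continue
        n_a = sum(1 for c in spanning if c[a])
        n_b = sum(1 for c in spanning if c[b])
        n_both = sum(1 for c in spanning if c[a] and c[b])
        rows.append({
            "sample": sample,
            "variant_a": a,
            "variant_b": b,
            "spanning_reads": len(spanning),
            "reads_a": n_a,
            "reads_b": n_b,
            "reads_both": n_both,
            # All reads supporting A also support B: A may have arisen after B.
            "a_nested_in_b": n_a > 0 and n_both == n_a,
            "b_nested_in_a": n_b > 0 and n_both == n_b,
        })
    return rows


def write_pair_support_csv(rows, out):
    """Write the rows to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

When both flags are true for a pair, the two variants always appear on the same reads, so the counts alone give no ordering. A real implementation would also need to account for sequencing error, for example by requiring a minimum number of spanning reads before trusting either flag.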

JPFinnigan commented 8 years ago

It might be useful to also implement this logic in varcode/topiary. Presumably, one would want downstream tools to be aware of the presence and relative strandedness of secondary germline/somatic variants within a particular genomic distance (e.g. the length of a PGV peptide, to pick a specific example).