morrislab / phylowgs

Application for inferring subclonal composition and evolution from whole-genome sequencing data.
GNU General Public License v3.0

TitanCNA clustering solution selection #96

Closed (peterpdu closed 6 years ago)

peterpdu commented 6 years ago

I was wondering how sensitive PhyloWGS is to the choice of clustering solution from TitanCNA output. For instance, if I choose a four-cluster solution rather than a three-cluster solution from TITAN as the CNV input to PhyloWGS (and also supply SSM data), will the resulting PhyloWGS solutions be very different?

jwintersinger commented 6 years ago

Hi @peterpdu,

PhyloWGS treats CNAs as pseudo-SSMs. As CNAs are supported by much larger genomic regions than SSMs, they will have a greater effect in driving the cellular prevalence (CP) of the clusters PhyloWGS assigns them to. I'm not terribly familiar with TITAN, but if the four-cluster solution and the three-cluster solution report substantially different cellular prevalences for their segments, these will likely push PhyloWGS to create different clusters. For example, if the four-cluster solution has segments with CPs [0.9, 0.7, 0.4, 0.1], these are likely different enough that PhyloWGS will create four clusters around these CPs, with SSMs potentially driving the creation of additional clusters or being assigned to the CNA-driven clusters.
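As a rough way to eyeball this before running anything, you could compare how many well-separated CP values each TITAN solution reports. The sketch below is only an illustration: `group_prevalences` is a hypothetical helper, the `tol=0.1` gap is an arbitrary choice, and the three-cluster CPs are made-up example values. It is not how PhyloWGS clusters (PhyloWGS samples trees by MCMC and has no such threshold), so treat it as a sanity check rather than a prediction.

```python
def group_prevalences(cps, tol=0.1):
    """Group cellular prevalences whose gap to the previous value is <= tol.

    Hypothetical helper for illustration only; `tol` is an arbitrary
    assumption, not a PhyloWGS or TITAN parameter.
    """
    groups = []
    for cp in sorted(cps, reverse=True):
        # Extend the current group if this CP is close to the last one seen.
        if groups and groups[-1][-1] - cp <= tol:
            groups[-1].append(cp)
        else:
            groups.append([cp])
    return groups


# CPs from the example above (four-cluster solution).
four_cluster = [0.9, 0.7, 0.4, 0.1]
# Made-up CPs standing in for a three-cluster solution.
three_cluster = [0.9, 0.45, 0.1]

for name, cps in [("four-cluster", four_cluster), ("three-cluster", three_cluster)]:
    groups = group_prevalences(cps)
    print(f"{name}: {len(groups)} well-separated CP groups -> {groups}")
```

If the two solutions yield different numbers of well-separated groups, or groups centred on clearly different CPs, that is a hint the resulting PhyloWGS trees may differ as well.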

Does that make sense?

peterpdu commented 6 years ago

Thanks, that's very helpful!