gavinha / TitanCNA

Analysis of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH) in cancer
GNU General Public License v3.0

Normal_contamination_estimate #64

Open udp3f opened 5 years ago

udp3f commented 5 years ago

Hi,

I am running the TitanCNA snakemake pipeline on tumor/normal pairs with maxploidy set to 2 (which is what we believe our sample to be), and I am trying different cluster initializations. I am perplexed by the normal contamination estimate I get for ploidy 2: the params file reports a normal contamination of 0.99 for any fixed cluster solution, which implies a tumor purity of 0.01. The same sample run with a max ploidy of 3 gives a normal contamination of 0.23 for cluster 1, which is close to what we expect to see.

I am not sure what this difference means. If I stick with ploidy 2, which we believe is correct for our sample, the tumor purity is only 1% across the different cluster solutions, which we cannot go with. Could you please explain what this might mean?

gavinha commented 5 years ago

Hi @udp3f

Are you able to show a plot of one of the chromosomes so that we can see what might be happening?

Thanks, Gavin

udp3f commented 5 years ago

[Image: tumor_a_cluster01_chr10]

udp3f commented 5 years ago

Hi Gavin,

Above is the chr10 plot for ploidy=2, cluster 1, for which the normal contamination was 0.99.

-Uma

gavinha commented 5 years ago

Hi @udp3f

The quality of the data seems really good and might not be the problem. Do you see this type of result for the rest of the chromosomes? If so, then perhaps the normal contamination estimate of 0.99 is actually reasonable?

-Gavin

udp3f commented 5 years ago

Hi Gavin,

The rest of the chromosomes look similar, except for chr17, which has an event; the figure is below. However, ploidy=2, cluster=5 has a normal estimate of 0.91, and I am attaching its chr10 image below as well.

[Image: tumor_a_cluster05_chr10]

[Image: tumor_a_cluster01_chr17]

gavinha commented 5 years ago

Hi @udp3f

Based on the event in chr17, the normal estimate should not be 0.91. It should be less than 0.1.

What are the estimates for the run initialized with ploidy=2 and a single cluster (cluster01)?

This might be an issue with the optimal solution selection.

-Gavin

udp3f commented 5 years ago

It is 0.99 for ploidy=2, cluster01 (fixed number of clusters = 1) and 0.91 for ploidy=2, cluster05 (fixed number of clusters = 5).

gavinha commented 5 years ago

Something funny is going on. It must be a bug. Would it be possible for you to share the RData file for the ploidy=2, cluster01 solution? If so, please send to gavinha@gmail.com.

Thanks, Gavin

udp3f commented 5 years ago

I mailed you the RData.

sahilseth commented 4 years ago

I am facing a similar issue. Were there any insights from this example?

I have 3 cases where titan selects a very low purity solution. However, VAFs suggest a higher purity.

sahilseth commented 4 years ago

[Image]

This one has a KRAS mutation with AF=0.3.

[Image]

The purity estimated by TitanCNA in this case is 2.5%; however, including CNV+SSM in PhyloWGS gives a purity of 70%.
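As a rough cross-check of the two purity figures, the relationship between purity and the expected allele fraction of a clonal somatic mutation can be sketched as below. This is a minimal illustration, not TitanCNA or PhyloWGS code. It assumes the KRAS mutation is clonal and sits on one copy of a copy-neutral diploid locus; the thread does not state the copy number at KRAS, so those defaults are assumptions, and the function names are hypothetical.

```python
def expected_vaf(purity, tumor_cn=2, normal_cn=2, mutant_copies=1):
    """Expected allele fraction of a clonal somatic mutation.

    purity        -- fraction of tumor cells in the sample (0..1)
    tumor_cn      -- total copy number at the locus in tumor cells (assumed)
    normal_cn     -- total copy number at the locus in normal cells
    mutant_copies -- copies carrying the mutation in each tumor cell (assumed)
    """
    mutant = purity * mutant_copies
    total = purity * tumor_cn + (1.0 - purity) * normal_cn
    return mutant / total


def purity_from_vaf(vaf, tumor_cn=2, normal_cn=2, mutant_copies=1):
    """Invert expected_vaf to get the purity implied by an observed VAF."""
    # vaf = p*m / (p*Ct + (1-p)*Cn)  =>  p = vaf*Cn / (m - vaf*(Ct - Cn))
    denom = mutant_copies - vaf * (tumor_cn - normal_cn)
    if denom <= 0:
        raise ValueError("VAF is not achievable under these copy-number assumptions")
    return vaf * normal_cn / denom


# Purity implied by AF=0.3 under the diploid, heterozygous assumption.
print(round(purity_from_vaf(0.3), 3))

# VAF one would expect if the purity really were about 2.4%.
print(round(expected_vaf(0.0241), 5))
```

Under these assumptions, AF=0.3 implies a purity of about 0.6, whereas a purity near 2.4% would predict an allele fraction of roughly 0.012. If the locus is amplified or the mutation is on multiple copies, the implied purity changes, so this is only a sanity check.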

| Phi | cluster | numClust | cellPrev | purity | norm | ploidy | loglik | sdbw | opt_phi | opt_solution |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 1 | 1 | 2.41% | 97.59% | 2.042 | -97664.4 | 1.0489 | TRUE | TRUE |
| 2 | 2 | 1 | 1 | 24.98% | 75.02% | 2.381 | -94448 | 2.8469 | TRUE | FALSE |
| 2 | 3 | 1 | 1 | 24.84% | 75.16% | 2.4 | -94403.4 | 2.777 | TRUE | FALSE |
| 2 | 4 | 1 | 1 | 24.91% | 75.09% | 2.474 | -94455.4 | 2.7487 | TRUE | FALSE |
| 2 | 5 | 2 | 1,0.8056 | 29.28% | 70.72% | 2.456 | -94267.3 | 2.3584 | TRUE | FALSE |
| 2 | 6 | 3 | 1,0.8643,0.785 | 23.74% | 76.26% | 1.775 | -93637.1 | 2.46 | TRUE | FALSE |
| 2 | 7 | 4 | 1,0.8667,0.7883,0.5433 | 23.63% | 76.37% | 1.794 | -93713.7 | 2.7261 | TRUE | FALSE |
| 2 | 8 | 2 | 1,0.8416 | 12.25% | 87.75% | 2.032 | -93187.1 | NA | TRUE | NA |
| 2 | 9 | 2 | 1,0.8464 | 12.24% | 87.76% | 2.039 | -93267.5 | 3.1268 | TRUE | FALSE |
| 2 | 10 | 2 | 1,0.8465 | 12.25% | 87.75% | 2.039 | -93259 | 3.1161 | TRUE | FALSE |
| 3 | 1 | 1 | 1 | 25.66% | 74.34% | 3.088 | -94954.2 | 2.3421 | FALSE | FALSE |
| 3 | 2 | 1 | 1 | 24.95% | 75.05% | 2.574 | -94604.7 | 2.7984 | FALSE | FALSE |
| 3 | 3 | 2 | 1,0.8069 | 29.51% | 70.49% | 2.542 | -94403.8 | 2.9229 | FALSE | FALSE |
| 3 | 4 | 3 | 1,0.8667,0.7361 | 32.09% | 67.91% | 2.539 | -94362.8 | 2.3935 | FALSE | FALSE |
| 3 | 5 | 4 | 1,0.8645,0.7608,0.6743 | 32.56% | 67.44% | 2.496 | -94212.7 | 2.3396 | FALSE | FALSE |
| 3 | 6 | 4 | 1,0.8185,0.7445,0.2326 | 31.85% | 68.15% | 2.281 | -93328.3 | 2.0768 | FALSE | FALSE |
| 3 | 7 | 5 | 1,0.8977,0.7976,0.4905,0.3858 | 31.34% | 68.66% | 2.78 | -95516.7 | 2.4976 | FALSE | FALSE |
| 3 | 8 | 4 | 1,0.7794,0.5641,0.3902 | 24.87% | 75.13% | 2.982 | -96234.1 | 3.0184 | FALSE | FALSE |
| 3 | 9 | 3 | 1,0.7551,0.4076 | 11.44% | 88.56% | 2.581 | -93950 | 2.8904 | FALSE | FALSE |
| 3 | 10 | 3 | 1,0.4281,0.301 | 10.92% | 89.08% | 2.46 | -93488.9 | 3.1647 | FALSE | FALSE |
| 4 | 1 | 1 | 1 | 94.05% | 5.96% | 4.046 | -94910.8 | 1.6776 | FALSE | FALSE |
| 4 | 2 | 2 | 1,0.793 | 42.36% | 57.64% | 4.119 | -93931.3 | 1.5277 | FALSE | FALSE |
| 4 | 3 | 2 | 1,0.819 | 40.36% | 59.64% | 3.993 | -93884.2 | 1.4405 | FALSE | FALSE |
| 4 | 4 | 3 | 1,0.8717,0.7593 | 41.62% | 58.38% | 3.981 | -93809.6 | 1.6194 | FALSE | FALSE |
| 4 | 5 | 4 | 1,0.8622,0.7674,0.6836 | 41.40% | 58.60% | 3.962 | -93801.7 | 1.4913 | FALSE | FALSE |
| 4 | 6 | 4 | 1,0.8626,0.7678,0.6842 | 41.40% | 58.60% | 3.961 | -93818 | 1.5302 | FALSE | FALSE |
| 4 | 7 | 4 | 1,0.8676,0.7675,0.6903 | 41.07% | 58.93% | 3.988 | -93842.2 | 1.438 | FALSE | FALSE |
| 4 | 8 | 4 | 1,0.8661,0.7665,0.6914 | 41.03% | 58.97% | 3.985 | -93852.8 | 1.3104 | FALSE | FALSE |
| 4 | 9 | 4 | 1,0.8634,0.7754,0.7069 | 40.99% | 59.01% | 3.959 | -93835.3 | 1.6504 | FALSE | FALSE |
| 4 | 10 | 4 | 1,0.863,0.775,0.7066 | 40.96% | 59.04% | 3.958 | -93838.4 | 1.7349 | FALSE | FALSE |
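To see how the flagged solution compares with the other ploidy-2 runs, the Phi = 2 rows of the table above can be ranked by different criteria. The sketch below is plain Python over values copied from the table, not TitanCNA's selection code. The observation that the `opt_solution` row coincides with the lowest `sdbw` among the Phi = 2 rows comes from the table itself; whether that is the actual selection rule is not confirmed in this thread. The `drop_low_purity` helper and its 5% cutoff are hypothetical, included only to show what a purity sanity filter would pick.

```python
# Ploidy-2 (Phi = 2) rows copied from the table above:
# (cluster, purity %, loglik, sdbw); sdbw is None where the table shows NA.
rows = [
    (1, 2.41, -97664.4, 1.0489),
    (2, 24.98, -94448.0, 2.8469),
    (3, 24.84, -94403.4, 2.777),
    (4, 24.91, -94455.4, 2.7487),
    (5, 29.28, -94267.3, 2.3584),
    (6, 23.74, -93637.1, 2.46),
    (7, 23.63, -93713.7, 2.7261),
    (8, 12.25, -93187.1, None),
    (9, 12.24, -93267.5, 3.1268),
    (10, 12.25, -93259.0, 3.1161),
]


def min_sdbw(candidates):
    """Return the row with the smallest sdbw, skipping rows where it is missing."""
    scored = [r for r in candidates if r[3] is not None]
    return min(scored, key=lambda r: r[3])


def max_loglik(candidates):
    """Return the row with the highest log-likelihood."""
    return max(candidates, key=lambda r: r[2])


def drop_low_purity(candidates, min_purity_pct):
    """Hypothetical sanity filter: discard solutions below a purity cutoff."""
    return [r for r in candidates if r[1] >= min_purity_pct]


best_by_sdbw = min_sdbw(rows)
best_by_loglik = max_loglik(rows)
# The 5% cutoff is an arbitrary illustrative value, not a TitanCNA default.
filtered_best = min_sdbw(drop_low_purity(rows, min_purity_pct=5.0))

print("lowest sdbw: cluster", best_by_sdbw[0], "purity", best_by_sdbw[1])
print("highest loglik: cluster", best_by_loglik[0], "purity", best_by_loglik[1])
print("lowest sdbw after purity filter: cluster", filtered_best[0], "purity", filtered_best[1])
```

On these rows, the lowest `sdbw` belongs to cluster 1 (purity 2.41%), which is also the row with the lowest log-likelihood of the ten, while the highest log-likelihood belongs to cluster 8, whose `sdbw` is NA. With the illustrative purity filter applied, the lowest `sdbw` moves to cluster 5 (purity 29.28%).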