wir963 closed this issue 2 years ago
Hi @Welles,
We actually marginalize (sum over all possibilities) the genotype, rather than estimating a fixed value. This allows the method to account for uncertainty in the genotype. The best explanation is in the supplemental material of the original PyClone paper.
You could, in theory, compute the probability of the different genotypes post hoc, conditioned on the cancer cell fraction (CCF) of the mutation. Basically, you fit the model and then post-process by cycling through all genotypes for a mutation, computing the probability of the observed read counts given the inferred CCF for each one, and then normalizing.
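A minimal sketch of that post-processing step might look like the following. It is not part of the PyClone-VI API: the function name, its parameters, the plain binomial likelihood (used here as a simplified stand-in for whatever emission density the model was fit with), the uniform prior over genotypes, and the restriction of the mutant copy number to 1..major CN are all assumptions made for illustration.

```python
from math import exp, log


def mutant_copy_posterior(alt_reads, total_reads, ccf, tumour_content,
                          major_cn, minor_cn, normal_cn=2):
    """Hypothetical helper: posterior over mutant copy number given a fixed CCF.

    Assumes a binomial read-count model and a uniform prior over the
    candidate mutant copy numbers 1..major_cn.
    """
    total_cn = major_cn + minor_cn
    # Average number of allele copies per cell across normal and tumour cells.
    denom = (1 - tumour_content) * normal_cn + tumour_content * total_cn

    log_lik = {}
    for m in range(1, major_cn + 1):
        # Expected variant allele fraction if mutated cells carry m mutant copies.
        vaf = tumour_content * ccf * m / denom
        # Clip away from 0 and 1 so the logarithms stay finite.
        vaf = min(max(vaf, 1e-9), 1 - 1e-9)
        # Binomial log-likelihood; the binomial coefficient is the same for
        # every genotype, so it cancels when normalizing and is omitted.
        log_lik[m] = (alt_reads * log(vaf)
                      + (total_reads - alt_reads) * log(1 - vaf))

    # Normalize in log space to avoid underflow at high read depth.
    max_ll = max(log_lik.values())
    weights = {m: exp(ll - max_ll) for m, ll in log_lik.items()}
    norm = sum(weights.values())
    return {m: w / norm for m, w in weights.items()}


# Example: clonal mutation (CCF = 1) in a pure sample, major CN = 3, minor CN = 2.
posterior = mutant_copy_posterior(alt_reads=40, total_reads=100, ccf=1.0,
                                  tumour_content=1.0, major_cn=3, minor_cn=2)
best_guess = max(posterior, key=posterior.get)
```

Taking the most probable entry of the returned dictionary gives a point estimate of the mutant copy number, while the full dictionary keeps the uncertainty that the marginalization is meant to preserve.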
Cheers, Andy
Okay, that makes sense! Thanks @aroth85! I may have some further questions about how to extract the model parameters, etc., but this is a good start.
Hey @aroth85,
I'm interested in extracting the copy number for a specific mutation x. For simplicity, let's assume that mutation x is clonal. If the region including mutation x has total copy number = 5 (major CN = 3, minor CN = 2), then mutation x could theoretically occur 1-5 times per cell (I also think it would be reasonable to argue that mutation x could theoretically occur 1-3 times because the major CN = 3, but this is not important to my question). I would like to know PyClone-VI's estimate of how many copies of mutation x occur in each cell.
With the caveat that I have not gone through the math, I assume that PyClone-VI estimates this quantity for each mutation. Am I correct in this assumption? If so, how can I extract a point estimate for this value from the model?
Best, Welles