b-niu closed this issue 4 years ago
It is the standard deviation, i.e. the square root of the variance. It is computed from the posterior distribution of the CCF for the cluster.
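To make "standard deviation of the posterior" concrete, here is a minimal sketch that computes the mean and standard deviation of a discrete distribution over CCF values. The grid, the toy weights, and the helper name `posterior_mean_std` are assumptions for illustration only; this is not PyClone-VI's internal code.

```python
import math


def posterior_mean_std(grid, probs):
    """Return (mean, std) of a discrete distribution over CCF values.

    grid  : CCF values at which the posterior is evaluated (hypothetical grid)
    probs : non-negative weights at those values (normalized here)
    """
    total = sum(probs)
    weights = [p / total for p in probs]
    mean = sum(w * x for w, x in zip(weights, grid))
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, grid))
    # The standard deviation is the square root of the variance.
    return mean, math.sqrt(variance)


# Toy posterior concentrated around CCF = 0.5 (made-up numbers).
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
probs = [0.0, 0.25, 0.5, 0.25, 0.0]
mean, std = posterior_mean_std(grid, probs)
print(f"mean={mean:.3f} std={std:.3f}")
```

A narrower posterior gives a smaller standard deviation, which is what makes the value useful for comparing clusters.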
It would be cellular_prevalence ± 1.96 * cellular_prevalence_std. That assumes the variables follow a Gaussian, which they likely don't. The posterior may be multimodal, for example, though that is rare if the cluster has more than two mutations assigned. It is probably better to think of the standard error as a relative measure of confidence for comparing estimates between clusters.
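For reference, the Gaussian approximation above can be sketched as follows. The function name `approx_ci95` is hypothetical, and clipping the bounds to [0, 1] is an added assumption (a CCF is a fraction), not something the output file does for you. Given the caveats about non-Gaussian posteriors, treat the result as a rough guide rather than an exact interval.

```python
def approx_ci95(cellular_prevalence, cellular_prevalence_std, z=1.96):
    """Approximate 95% interval under a Gaussian assumption.

    The bounds are clipped to [0, 1] because a CCF cannot fall outside
    that range; the clipping is an assumption made for this sketch.
    """
    half_width = z * cellular_prevalence_std
    lower = max(0.0, cellular_prevalence - half_width)
    upper = min(1.0, cellular_prevalence + half_width)
    return lower, upper


# Made-up values standing in for one row of the output file.
lower, upper = approx_ci95(0.40, 0.05)
print(f"approx. 95% interval: [{lower:.3f}, {upper:.3f}]")
```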
This is expected. The CCF quoted is the mean value of the cluster the mutation is assigned to. This differs from PyClone, where we compute the mean value of the CCF across the MCMC samples. The latter better represents uncertainty over clustering, but I suspect it makes little difference in practice.
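The difference between the two summaries can be illustrated with a toy example. Everything here is made up for illustration: the sample structure, the cluster labels, and the numbers are assumptions, and neither tool is implemented this way. The point is only that averaging a mutation's CCF across samples lets occasional reassignments to another cluster pull the value away from the assigned cluster's mean.

```python
from collections import Counter
from statistics import mean

# Hypothetical posterior samples for ONE mutation: each entry holds the
# cluster the mutation landed in and the CCF of every cluster in that sample.
samples = [
    ("A", {"A": 0.50, "B": 0.20}),
    ("A", {"A": 0.52, "B": 0.22}),
    ("B", {"A": 0.48, "B": 0.18}),  # mutation briefly assigned to cluster B
    ("A", {"A": 0.50, "B": 0.20}),
]

# Per-mutation summary: average the mutation's own CCF across samples,
# so the one sample spent in cluster B lowers the mean.
mcmc_mean = mean(ccfs[label] for label, ccfs in samples)

# Cluster-level summary: pick the most frequent cluster, then report that
# cluster's mean CCF. Every mutation in the cluster gets the same value.
assigned = Counter(label for label, _ in samples).most_common(1)[0][0]
cluster_mean = mean(ccfs[assigned] for _, ccfs in samples)

print(f"per-mutation mean: {mcmc_mean:.3f}")
print(f"assigned cluster {assigned} mean: {cluster_mean:.3f}")
```

When assignments are stable across samples, the two numbers coincide, which is consistent with the remark that the difference is usually small in practice.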
Thank you, Professor. My questions have been answered.
Best regards, Bing
Dear Professor Roth,
I have some questions about the output file.
I think, generally speaking, std means standard deviation. Is it standard error in the case of pyclone-vi?
If we are going to calculate the 95% CI of the CCF of a mutation, should we calculate it as:
or
"size" is the size of each cluster_id.
I have noticed that, in the output of pyclone-vi, every mutation_id in the same cluster_id shares the same CCF and std, which is different from PyClone's behavior. Is that by design?
Here are two examples.
PyClone-VI:
PyClone: