xavierdidelot / ClonalFrameML

ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes
GNU General Public License v3.0
108 stars 27 forks source link

95% CI for r/m #114

Closed jodsikorski closed 4 years ago

jodsikorski commented 4 years ago

Dear Xavier, I have seen in your ClonalFrameML paper (Plos Computational Biology 2015) 95% CI values for r/m, and I have seen in several Github issues (#99, #87) your explanations on how to calculate 95% CI for R/theta, delta, or nu, but I probably missed the explanation for how to determine 95% CI for r/m (I do know how to get r/m). Could you please explain this to me?

Thanks a lot, best, Johannes

xavierdidelot commented 4 years ago

Hi Johannes, r/m is obtained by multiplying together R/theta, delta and nu. Best wishes, Xavier

jodsikorski commented 4 years ago

Hi Xavier, thanks a lot for the fast reply. Yes, I understood how to get r/m, but how do I get the 95% CI on r/m? Following post #99, is there a way to get a_post and b_post for r/m? Getting the 95% CI for R/Theta, 1/delta or nu should be straightforward, but how am I doing this for r/m? Thanks for your help, best, Johannes

xavierdidelot commented 4 years ago

You could compute the 95% CI for R/theta, 1/delta and nu, and then use the formula above to get the 95% CI of r/m.

jodsikorski commented 4 years ago

Ok, great, I will try this. thanks and best, Johannes

jodsikorski commented 3 years ago

Hi Xavier, sorry to bother you again on this issue. If I run ClonalFrame in branch-specific mode (-embranch true), how do I get the 95% CI for branches, inespecially that of r/m? I would approach this as shown below:

Here is an example from the output, which does not contain values for a_post and b_post: Parameter Branch Posterior Mean Posterior Variance a_post b_post R/theta Mean 0.150307 NA NA NA 1/delta Mean 0.0149936 NA NA NA nu Mean 0.219793 NA NA NA M Mean 0.000287215 NA NA NA

An example for a branch in this analysis is: R/theta Psal182 0.892049 0.00747211 106.496 119.384 1/delta Psal182 1.04044 0.0101648 106.496 102.357 nu Psal182 1.00657 0.00748065 135.439 134.556 M Psal182 0.305987 0.000586788 159.561 521.462

I do know how to proceed getting delta and r/m (first multiply branch-specific parameter by the mean value, then do R/theta delta nu = r/m). I know from #99 how to get the 95% CI in R: qgamma(c(0.025,0.975),shape=a_post,rate=b_post). But I have only the branch-specific a_post and b_post values, not from the Mean (the 4 first lines).

Is it correct to first determine the branch-specific 95% CI (using branch-specific a_post and b_post) and then to multiply these 95% CIs with the posterior mean from the MEAN? Example from above for R/theta and branch Psal182: (1) 0.892049 [branch value] * 0.150307 [MEAN] = 0.134081 R/theta for Psal182 (2) 95% CI for R/theta in Psal182 is :

qgamma(c(0.025,0.975),shape=106.49,rate=119.38) [1] 0.7306709 1.0692371 lower CI is then: 0.7306709 0.150307 [MEAN] = 0.109825 upper CI is then: 1.0692371 0.150307 [MEAN] = 0.1607138

The values would fit: R/theta for Psal182 would then be 0.134081 [ 0.109825 , 0.1607138]

If this is correct, I would proceed respectively with 1/delta and nu, and would then get the final 95% CI for r/m by submitting the 95%CI values of the individual parameter to the formular r/m = R/theta delta nu? With the above example this would then result in an r/m value (and 95%CI) for branch Psal182 of r/m = 1.91 [1.59, 2.23].

Does this makes sense? Thanks again for your help, best, Johannes

xavierdidelot commented 3 years ago

Hi Johannes,

Everything you wrote seems correct to me, well done for figuring it out by yourself! The branch-specific values need to be multiplied by the mean values in the top four lines as you wrote. See also issue #87

Best wishes, Xavier

jodsikorski commented 3 years ago

Hi Xavier,

thanks so much for the confirmation, I hope that I am then done with the analysis, best, Johannes