PoonLab / vindels

Developing an empirical model of sequence insertion and deletion in virus genomes
1 stars 0 forks source link

Clock rate not exploring parameter space #85

Closed jpalmer37 closed 5 years ago

jpalmer37 commented 5 years ago

Here is the ucld.mean prior in the .xml files I'm using:

<logNormalPrior mean="5.0E-5" offset="0.0" stdev="1.0E-4">
    <parameter idref="ucld.mean" />
</logNormalPrior>           

Here's a preview of the distribution: Screenshot from 2019-10-10 16-19-38

As you mentioned, using a wider distribution could help prevent the clock rate from sticking close to the mean value.

ArtPoon commented 5 years ago

What version of BEAUTI are you using? When I apply these settings in BEAUTI v1.8.4, I get the following:

Screenshot from 2019-10-10 16-31-09

jpalmer37 commented 5 years ago

Beauti v1.10.4

Maybe it's computing the standard deviation term differently since the term looks different (?). Just a guess.

ArtPoon commented 5 years ago

Simulate from the prior - it's the only way to be sure

jpalmer37 commented 5 years ago

I simulated values from the prior and these were some results:

With Mean = 5e-5 and Stdev = 1e-4:

[1] "12255-a.log"

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
5.277e-07 9.079e-06 2.393e-05 5.418e-05 5.735e-05 2.077e-03 

        2.5%        97.5% 
1.883982e-06 2.553668e-04 

[1] "30622-a.log"

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
2.218e-07 8.956e-06 2.077e-05 4.488e-05 4.774e-05 1.852e-03 

        2.5%        97.5% 
1.823632e-06 2.380033e-04 

[1] "30647-a.log"

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
7.633e-07 8.490e-06 2.029e-05 4.124e-05 4.935e-05 6.154e-04 

        2.5%        97.5% 
1.846073e-06 2.510803e-04 

EDIT: I added in the 95% quantiles as you requested

jpalmer37 commented 5 years ago

Patient 30647 is the one major case that inspired our use of a clock rate prior.

Here are the posterior and ucld.mean traces without a clock rate prior: Screenshot from 2019-10-11 13-17-29 Screenshot from 2019-10-11 14-55-24

and then with one: Screenshot from 2019-10-11 13-17-33 Screenshot from 2019-10-11 14-56-17

More info:

[1] "30647-a.log" [WITHOUT PRIOR]
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
2.264e-05 3.072e-05 3.339e-05 3.398e-05 3.614e-05 6.063e-04 
[1] "30647-a.log" [WITH PRIOR]
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-3.395e-04 -1.836e-05  5.025e-05  4.866e-05  1.165e-04  4.482e-04 

There are two or three more cases that have a small lag during the burn-in on the clock rate. Those cases are basically negligible compared to the substantial burn-in lag on patient 30647 shown above.

jpalmer37 commented 5 years ago

I added 95% quantiles to the clock rate values retrieved by sampling from the prior only (above)

jpalmer37 commented 5 years ago

This was solved by changing the mpibeast.py script on the cluster to reference my local version of BEAST v1.10 instead of BEAST v1.8.