liamrevell / phytools

GNU General Public License v3.0
198 stars 56 forks source link

make.simmap #114

Open navilo opened 2 years ago

navilo commented 2 years ago

Hi Liam,

I am using make.simmap and I have run into a problem with zero transition rates in the Q matrix for the SYM/ARD models, similar to that described on your blog Bug fix for make.simmap with asymmetric substitution model. I have run make.simmap with the following arguments: Q='mcmc', prior=list(use.empirical=TRUE), pi='estimated', model='SYM'/'ARD'. The function calculated all map trees using the same Q matrix that has one or more zero transition rates (Q='mcmc' should generally result into applying different Q matrices produced by mcmc to each map tree). It seems to me that as soon as mcmc runs into Q matrix with any transition rate close to zero, it will stuck to the matrix (this usually happens already during burnin). I have tried to apply tol argument, but it did not help. Then I checked the code and find out the tol argument functionality changed and it applies only if the sum of all elements in a row is lower than tol: https://github.com/liamrevell/phytools/blob/4a076837eda98d43f345c4b1d297037819186626/R/make.simmap.R#L221

However, this does not apply to my situation where each row has at least one non-zero value. Is it possible to change back the tol functionality?

Thank you

liamrevell commented 1 year ago

Can you provide a reproducible example of this issue?

navilo commented 1 year ago

Sure, please find enclosed Matrix.csv, Tree.txt and simmap-testcase.txt (r-script). simmap-testcase.txt Tree.txt Matrix.csv

Actually the problem starts already in fitMk, which is used to estimate the alpha parameter of priors. fitMk calculates zero transition rates for M->A and F->M when selecting the ARD model. This, in turn, generates a gamma prior distribution with alpha=0.0 for these two transitions in make.simmap when the option use.empirical=TRUE is used. My workaround was to specify the prior$alpha explicitly with a relatively high alpha value (at about the same order of magnitude as the next greater transition rate, which was 1.98 in my case). In the enclosed testcase, even alpha=0.01beta produced near zero transition rates (1e-124) and no changes in Q matrices. The expected mcmc behaviour was produced with alpha=0.5beta.

Adding additional/restoring previous check on Q matrix prevening individual transition rates from getting close to zero (as mentioned in my previous comment), might be also helpful to make the mcmc less sensitive to too different priors (I did not tried it).