emmanuelparadis / coalescentMCMC

GNU General Public License v2.0
0 stars 1 forks source link

Output analysis and TMRCA of the linear model #2

Open damianfreilij opened 1 year ago

damianfreilij commented 1 year ago

Dear Emmanuel

I am writing because I have been running the coalescentMCMC function and I have several questions.

1) How to analyse the output of the package and get the average values of the simulations? From what I saw, it is suggested to use the coda package. I tried to use the following commands:

outlin2<-subset(outlin, burnin = 5000) HPDinterval(outlindef, prob = 0.95) #to get CI summary(outlindef) #mean and quantiles

However, in cases where some simulations yielded an NA, HPDinterval and summary didn't run and I don't know how to make those lines work...however I found this other way for the TMRCA but I don't know if it's correct:

TMRCA <- outlin2[5001:100000, 4]

outlin2[5001:100000, 4] #5000 as burnin

V<-as.vector(TMRCA) V2<-V[!is.na(V)] #remove NAs

library(Rmisc) CI(V2) #calculate confidence intervals

Is it correct? Is there a better way using other commands?

2) Based on the BIC the best model for my sequences is the linear one. I worked with the package previously, and for this model the output had the following parameters:

logLik, theta0, thetaT, TMRCA.

However, working now with version 0.4-4 the parameters are:

logLik.tree, logLik.coal, theta0, thetaT.

In this way, I don't know where to get TMRCA from, is it from some extra calculation? Should I try to install the previous version? Which one?

Thank you very much for taking the time to read me, Regards

emmanuelparadis commented 1 year ago

Hi Damian,

  1. Yes you are correct. I checked the code in coda and there does not seem a way to handle NA's (several functions in R have the option na.rm, such as max and min which are called by HPDinterval, but this is not implemented in coda). So I do not see a better way than what you did.
  2. The parameterization of this model has been changed: Tmrca is no more estimated, so there are now only 2 parameters (THETA at present and THETA at the Tmrca). Sorry the vignette has not been updated. If you want these Tmrca, you can get them using ape::branching.times after extracting the coalescent trees (see getMCMCtrees).

Best, Emmanuel

damianfreilij commented 1 year ago

Dear Emmanuel, thank you for your reply.

I tried to use branching.times but it doesn't work with a multiphylo. Is there any loop or set of functions you recommend me use to extract from all the trees and do the average?

branching.times(TRbt2) #TRbt2 is the multiphylo object retrieved using getMCMCtrees Error in branching.times(TRbt2) : object "phy" is not of class "phylo"

Thanks, Damian

emmanuelparadis commented 1 year ago

Since an object of class "multiPhylo" is also a list, you can do:

X <- sapply(TRbt2, branching.times)

But because it is very likely that the topologies will not be the same for all trees, you are interested only in the first value returned by branching.times which you can get with:

sapply(X, "[", i = 1)

Alternatively, in a single command:

sapply(TRbt2, function(x) branching.times(x)[1])

Cheers, Emmanuel