xavierdidelot / TransPhylo

Reconstruction of transmission trees using genomic data
http://xavierdidelot.github.io/TransPhylo/
GNU General Public License v2.0

Run time differences depending on tree time scale #16

igoldsteinh closed 2 years ago

igoldsteinh commented 3 years ago

Hi there. I've been using TransPhylo on some simulations for a while and noticed something quite curious. In my experience, TransPhylo runs much faster if the input tree is on the scale of years rather than months. In my own work, trees on the scale of months can take 10+ hours to run, while the same trees on the scale of years can take less than an hour. My current understanding of the method is that the time scale shouldn't particularly affect the analysis, and this discrepancy has me worried about my understanding of the methodology. Any insights are appreciated. A code example based on the TransPhylo tutorial is below: the original tree in years takes around 1.5 seconds to run, while a rescaled version of the tree in months takes around 23 seconds on my computer.

# simulating outbreak in transphylo, and checking if changing time scale matters
library(TransPhylo)
library(ape)
set.seed(0)

neg=100/365
off.r=5
w.shape=10
w.scale=0.1
pi=0.25

simu <- simulateOutbreak(neg=neg,pi=pi,off.r=off.r,w.shape=w.shape,
                         w.scale=w.scale,dateStartOutbreak=2005,dateT=2008)

ptree<-extractPTree(simu)
plot(ptree)
p<-phyloFromPTree(ptree)
plot(p)
axisPhylo(backward = FALSE)

w.shape=10
w.scale=0.1
dateT=2008
ptm <- proc.time()
res<-inferTTree(ptree,mcmcIterations=1000,w.shape=w.shape,w.scale=w.scale,dateT=dateT)

year_time <- proc.time() - ptm
# year_time is about 1.5 seconds

# repeating the analysis using the same tree scaled to months not years
month_p <- p
month_p$edge.length <- month_p$edge.length*12
plot(month_p)
axisPhylo(backward = FALSE)

month_DTL <- (max(ptree$ptree[,1]) - 2005)*12
month_ptree<-ptreeFromPhylo(month_p,dateLastSample=month_DTL)

month_w.shape=10
month_w.scale=12/10

ptm_month <- proc.time()
month_res<-inferTTree(month_ptree,
                      mcmcIterations=1000,
                      w.shape=month_w.shape,
                      w.scale=month_w.scale,
                      dateT= 36)

month_time <- proc.time() - ptm_month
# month_time is about 23.13 seconds

Thank you for your time, Isaac Goldstein

xavierdidelot commented 3 years ago

That's interesting, everything you did looks good to me and I don't know why there is this time difference. I'll have to investigate this further when I'm back from holiday in two weeks.

vnminin commented 3 years ago

Thanks @igoldsteinh and @xavierdidelot! @xavierdidelot, Isaac and I work together on this. We'll do some profiling in the meantime to see what functions become more time consuming after the change of the time scale.

xavierdidelot commented 2 years ago

Hi Isaac and Vladimir,

Thanks again for bringing this issue to my attention. There is a precision parameter called delta_t in the calculation of the prior probability of a transmission tree. This represents the size of the discretisation grid and was previously set to 0.01 with no easy way to change it. I have now added this possibility as an optional parameter of inferTTree. So when you run with months as your time unit, you can now set delta_t=0.01*12, and this should give you a run time similar to what you get with years as your time unit. I'll close this issue for now, but don't hesitate to reopen if anything is unclear!

Best wishes, Xavier
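
For readers following along, here is a minimal sketch of how the delta_t option described above might be applied to the month-scaled example from the start of the thread. It assumes a TransPhylo version recent enough to include the delta_t argument of inferTTree. The helper rescale_time_params is hypothetical, not part of TransPhylo; it only multiplies each time-valued quantity by the same conversion factor. The presumed reasoning is that a grid step of 0.01 in months covers the same calendar span with 12 times as many steps as a step of 0.01 in years, so scaling delta_t restores the original grid.

```r
# Hypothetical helper (not part of TransPhylo): multiply every quantity
# that is measured in time units by the same conversion factor.
rescale_time_params <- function(w.scale, dateT, delta_t = 0.01, factor = 12) {
  list(w.scale = w.scale * factor,
       dateT   = dateT * factor,
       delta_t = delta_t * factor)
}

# Years -> months. dateT is expressed relative to the outbreak start (2005),
# matching the month-scaled tree built earlier in the thread.
month_par <- rescale_time_params(w.scale = 0.1, dateT = 2008 - 2005)
str(month_par)

# Run the inference only if TransPhylo is installed and month_ptree
# (constructed in the original example) exists in the workspace.
if (requireNamespace("TransPhylo", quietly = TRUE) && exists("month_ptree")) {
  month_res <- TransPhylo::inferTTree(month_ptree,
                                      mcmcIterations = 1000,
                                      w.shape = 10,  # gamma shape is unitless, so unchanged
                                      w.scale = month_par$w.scale,
                                      dateT   = month_par$dateT,
                                      delta_t = month_par$delta_t)
}
```

Keeping the conversion in one place makes it harder to rescale the tree and the generation-time scale but forget the grid size, which is the situation that produced the slow runs reported above.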

igoldsteinh commented 2 years ago

Sorry for the late reply, that seems very clear. Thanks so much for looking into this.

igoldsteinh commented 2 years ago

Hi Xavier, I'm coming back to this now and realizing that we've actually mostly been using infer_multittree_share_param rather than inferTTree. Any chance that delta_t could be changed for that function as well? Thanks again.

xavierdidelot commented 2 years ago

Sure, I've just added the delta_t parameter to infer_multittree_share_param in the latest commit.

igoldsteinh commented 2 years ago

Thanks again!