liamrevell / phytools

GNU General Public License v3.0

make.simmap appears to run slow #25

Open halabikeren opened 6 years ago

halabikeren commented 6 years ago

Dear Liam,

I have noticed that make.simmap runs slower than I expected. For example, generating 1000 mappings for a tree with 50 taxa using the following call took minutes:

mtrees<-make.simmap(tree,tree$states,model="ER", nsim=100,pi="estimated")

Following your suggestion in GitHub issue #22 reduced the running time, but I would like to avoid parallelization.

I wonder whether the long running time results from history sampling not using the correction (equation 11 in Nielsen 2002) for branches where the parent state differs from the child state. Without it, many repeated attempts may be needed to simulate a mapping that is consistent with the states at both ends of a branch.

Could this be the reason? If so, what can I do to work around it?
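
To make the correction concrete, here is a minimal base-R sketch of per-branch history sampling by rejection. It is not the phytools code: `rtexp` and `sim_branch` are hypothetical helper names, the rate matrix `Q` is made up for illustration, and the sketch assumes there are no absorbing states. When the parent and child states differ, the first waiting time is drawn from an exponential truncated to the branch length (my reading of the correction), so at least one change is guaranteed; otherwise a plain `rexp` draw is used.

```r
# Hypothetical helper: draw a waiting time from Exp(rate) conditioned on
# being shorter than t (inverse CDF of the truncated exponential).
rtexp <- function(rate, t) {
  -log(1 - runif(1) * (1 - exp(-rate * t))) / rate
}

# Hypothetical helper: simulate one history along a branch of length t that
# starts in state `from` and must end in state `to`, under rate matrix Q.
# Histories that end in the wrong state are rejected and re-simulated.
# Assumes every state has a non-zero exit rate (no absorbing states).
sim_branch <- function(Q, from, to, t, correct = TRUE) {
  attempts <- 0
  repeat {
    attempts <- attempts + 1
    state <- from
    elapsed <- 0
    states <- from
    change_times <- numeric(0)
    first <- TRUE
    repeat {
      rate <- -Q[state, state]
      # With the correction, the first change is forced to fall on the branch
      # whenever the end states differ.
      wait <- if (first && correct && from != to) rtexp(rate, t) else rexp(1, rate)
      first <- FALSE
      if (elapsed + wait >= t) break
      elapsed <- elapsed + wait
      p <- Q[state, ]
      p[state] <- 0                      # cannot "change" to the same state
      state <- sample(seq_len(nrow(Q)), 1, prob = p)
      states <- c(states, state)
      change_times <- c(change_times, elapsed)
    }
    if (state == to) {
      return(list(states = states, change_times = change_times,
                  attempts = attempts))
    }
  }
}

# Illustrative comparison on a short branch with a low rate (made-up values).
set.seed(1)
Q <- matrix(c(-0.05, 0.05, 0.05, -0.05), 2, 2)
with_corr <- mean(replicate(200, sim_branch(Q, 1, 2, 1, correct = TRUE)$attempts))
without   <- mean(replicate(200, sim_branch(Q, 1, 2, 1, correct = FALSE)$attempts))
cat("mean attempts with correction:", with_corr, "\n")
cat("mean attempts without correction:", without, "\n")
```

On short branches with low rates, the uncorrected sampler usually produces no change at all and is rejected, which is the kind of repeated attempt I suspect is costing time.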

Many thanks! Keren

halabikeren commented 6 years ago

Dear Liam,

I have noticed that the correction mentioned above does not appear in lines 263-277 of your make.simmap.R. Is it perhaps implemented somewhere else?

If not, please find below an R implementation of history simulation along a branch that includes the correction in the special case where the parent state differs from the child state. The correction can be seen in lines 23-39.

SM_per_branch_R.R.txt

If you prefer, I could adapt it to match make.simmap.R and submit it as a pull request.

Cheers, Keren