e3bo / 2015phylo

Check for convergence of BEAST MCMC chains from multiple starting points #14

Closed e3bo closed 9 years ago

e3bo commented 9 years ago

More generally, give a thorough diagnostic check of all of the MCMC output. There are many suggestions here such as

e3bo commented 9 years ago

The Wilson-Balding and Wide Exchange operators have very low acceptance rates (0.0128 and 0.0063 in the operator analysis below). According to the BEAST2 book, however, that can happen when some aspects of the topology are very well supported. Several clades have posterior probabilities of 1, so maybe that explains this.

Operator analysis

Operator                                  Tuning  Count     Time     Time/Op  Pr(accept)
scale(ac)                                 0.492   94578     164372   1.74     0.2645
scale(ag)                                 0.597   94886     164811   1.74     0.2506
scale(at)                                 0.441   94760     164911   1.74     0.2544
scale(cg)                                 0.327   94887     164732   1.74     0.2768
scale(gt)                                 0.566   94555     164071   1.74     0.2588
frequencies                               0.013   94439     164502   1.74     0.242
scale(alpha)                              0.179   94806     164046   1.73     0.3135
scale(ucld.mean)                          0.801   2837505   4948804  1.74     0.229
scale(ucld.stdev)                         0.775   2834417   4945060  1.74     0.2354
subtreeSlide(treeModel)                   0.177   14180563  2842413  0.2      0.2064
Narrow Exchange(treeModel)                        14186006  2748715  0.19     0.2211
Wide Exchange(treeModel)                          2839238   286350   0.1      0.0063
wilsonBalding(treeModel)                          2836513   846762   0.3      0.0128
scale(treeModel.rootHeight)               0.749   2839033   320739   0.11     0.2243
uniform(nodeHeights(treeModel))                   28366097  8115390  0.29     0.4993
scale(exponential.popSize)                0.529   2834382   63650    0.02     0.2305
scale(exponential.doublingTime)           0.165   2836763   63714    0.02     0.263
up:ucld.mean down:nodeHeights(treeModel)  0.971   2837237   3070905  1.08     0.2281
swapOperator(branchRates.categories)              9454443   3839201  0.41     0.4272
uniformInteger(branchRates.categories)            9454892   2712718  0.29     0.5661
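
A quick way to pick out the poorly accepted operators from a table like this is to parse the rows and filter on Pr(accept). The following is only a minimal sketch: the helper names (`parse_operator_line`, `low_acceptance`) and the 0.05 cutoff are assumptions made for illustration, not anything provided or recommended by BEAST, and only a few rows from the table are included as sample input.

```python
# Sample rows copied from the operator analysis above (whitespace separated).
SAMPLE_ROWS = """\
scale(ac) 0.492 94578 164372 1.74 0.2645
subtreeSlide(treeModel) 0.177 14180563 2842413 0.2 0.2064
Narrow Exchange(treeModel) 14186006 2748715 0.19 0.2211
Wide Exchange(treeModel) 2839238 286350 0.1 0.0063
wilsonBalding(treeModel) 2836513 846762 0.3 0.0128
up:ucld.mean down:nodeHeights(treeModel) 0.971 2837237 3070905 1.08 0.2281
"""


def parse_operator_line(line):
    """Split one operator-analysis row into its name and numeric columns."""
    tokens = line.split()
    # The last four columns are always present: Count, Time, Time/Op, Pr(accept).
    count, time, time_per_op, p_accept = tokens[-4:]
    rest = tokens[:-4]
    tuning = None
    # The Tuning column is blank for some operators, so only consume it
    # when the token just before Count parses as a number.
    if len(rest) > 1:
        try:
            tuning = float(rest[-1])
            rest = rest[:-1]
        except ValueError:
            pass
    return {
        "operator": " ".join(rest),
        "tuning": tuning,
        "count": int(count),
        "time": int(time),
        "time_per_op": float(time_per_op),
        "p_accept": float(p_accept),
    }


def low_acceptance(rows, threshold=0.05):
    """Return rows whose acceptance probability is below an assumed threshold."""
    return [row for row in rows if row["p_accept"] < threshold]


rows = [parse_operator_line(line) for line in SAMPLE_ROWS.splitlines()]
for row in low_acceptance(rows):
    print(f"{row['operator']}: Pr(accept) = {row['p_accept']}")
```

With the sample rows, only the Wide Exchange and wilsonBalding operators fall under the assumed cutoff, which matches the observation above.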

e3bo commented 9 years ago

This looks like it might be a good way to examine convergence of the sample of trees:

Chris Whidden and Frederick A. Matsen IV. (2015). Quantifying MCMC Exploration of Phylogenetic Tree Space. Systematic Biology. First published online: January 27, 2015. Supplemental material available at datadryad.org.

https://github.com/cwhidden/sprspace

e3bo commented 9 years ago

It seems all sampled trees have unique topologies, which I suppose follows from many of the sequences being very similar. But I should be able to look at the frequency of topologies of better supported clades or branches. This program may provide some useful ways of filtering out poorly resolved aspects of the topologies.
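
To illustrate the idea of looking at clade frequencies when every full topology is unique, here is a minimal sketch. It is not the sprspace program and does not read BEAST output; it assumes, purely for illustration, that each sampled rooted tree has already been converted to nested tuples of taxon names. The function names and the toy trees are hypothetical.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set of this subtree, set of clades found in the subtree).

    A tree is either a taxon name (str) or a tuple of child subtrees.
    Single-taxon clades are not recorded; the root clade is.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def clade_frequencies(trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter()
    for tree in trees:
        _, found = clades(tree)
        counts.update(found)
    return {clade: n / len(trees) for clade, n in counts.items()}


# Three toy trees with three different topologies that all share clade {A, B}.
sample = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "D"), "C"),
    (("A", "B"), ("C", "D")),
]

freqs = clade_frequencies(sample)
for clade, freq in sorted(freqs.items(), key=lambda item: (-item[1], sorted(item[0]))):
    print(sorted(clade), round(freq, 3))
```

In the toy sample no topology repeats, yet the clade {A, B} appears in every tree. Comparing such per-clade frequencies between independent runs is one way to check agreement on the well-supported parts of the tree.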

e3bo commented 9 years ago

Comparing the densitrees from different runs should indicate whether the distribution of node ages is qualitatively different.
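
A numerical complement to the visual comparison could be the Gelman-Rubin potential scale reduction factor, computed for a scalar such as the age of one node across the runs. The sketch below assumes equal-length, post-burn-in samples of that scalar from each run; the function name and the toy numbers are made up for illustration and are not output from these analyses.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Gelman-Rubin statistic for equal-length chains of one scalar parameter."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Toy example: two runs sampling similar node ages, and one run stuck elsewhere.
run_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
run_b = [10.0, 10.2, 9.9, 10.1, 9.7, 10.3]
run_c = [12.1, 11.8, 12.3, 12.0, 11.9, 12.2]

print(potential_scale_reduction([run_a, run_b]))
print(potential_scale_reduction([run_a, run_c]))
```

Values close to 1 are consistent with the runs sampling the same distribution for that parameter, while values well above 1 suggest the runs disagree.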