Open zhaottcrystal opened 10 years ago
Another thing I am considering is improving the MCMC part by reporting an estimated rate matrix Q, computed as the posterior mean, when the input is tree structure data and DNA and protein sequences. Right now the code writes each element of the rate matrix to a CSV file every 10 MCMC iterations, but it does not produce an overall rate matrix.
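To make the idea concrete, here is a minimal sketch of the posterior-mean estimate: an element-wise average of the sampled rate matrices after discarding a burn-in. The function name `posterior_mean_rate_matrix`, the `burn_in` parameter, and the in-memory list-of-matrices input are assumptions for illustration only. The layout of the existing CSV output is not specified here, so loading the samples is left out.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def posterior_mean_rate_matrix(samples: Sequence[Matrix], burn_in: int = 0) -> Matrix:
    """Element-wise mean of sampled rate matrices (hypothetical helper).

    samples: one square matrix per recorded MCMC iteration.
    burn_in: number of leading samples to discard before averaging.
    """
    kept = samples[burn_in:]
    if not kept:
        raise ValueError("no samples left after burn-in")
    n = len(kept[0])
    total = [[0.0] * n for _ in range(n)]
    for q in kept:
        for i in range(n):
            for j in range(n):
                total[i][j] += q[i][j]
    # Averaging is linear, so if every sampled row sums to zero,
    # every row of the mean also sums to zero (up to rounding).
    return [[value / len(kept) for value in row] for row in total]
```

Because each sample is written every 10 iterations, the average is taken over the thinned chain, which is still a valid Monte Carlo estimate of the posterior mean.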
This is (slightly) addressed in commit
While it is not dynamic, there is now a suite of diagnostic plots :) Again, not a great option, but you could choose to view the plots at intermediary points...
I am thinking of adding some MCMC convergence tests, such as the Gelman-Rubin diagnostic, to Blang. For trees, maybe a test that calculates the average standard deviation of split frequencies. But I will try to implement the Gelman-Rubin test first, since it looks simpler. The motivation is that for protein sequence data, when I estimate the rate matrix and the tree topology, I usually run a lot of MCMC iterations to ensure convergence. So I want to create some test functions that let me decide the number of iterations dynamically. If, while the code is running, the test statistic indicates that the chain has converged, then I can stop immediately and save a lot of time. On the other hand, if the chain has not converged after all the iterations, then I can use the current parameter values as the starting state and run more iterations. That is what I mean by "dynamically".
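As a rough illustration of the proposal, the sketch below computes the Gelman-Rubin statistic for one scalar parameter across several chains and uses it to stop early. It is written in plain Python rather than against Blang, and everything named here (`gelman_rubin`, `run_until_converged`, the `step` callback, `check_every`, `max_iters`, and the 1.1 threshold) is a hypothetical choice for the example, not an existing Blang API. For simplicity it uses all samples collected so far and does not discard a burn-in.

```python
import math
from typing import Callable, List, Sequence, Tuple


def gelman_rubin(chains: Sequence[Sequence[float]]) -> float:
    """Potential scale reduction factor for m chains of equal length n."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 samples each")
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # W: average within-chain sample variance.
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    if w == 0.0:
        raise ValueError("within-chain variance is zero")
    # B: between-chain variance, scaled by chain length.
    b = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)


def run_until_converged(
    step: Callable[[float], float],
    init_states: Sequence[float],
    check_every: int = 100,
    max_iters: int = 10_000,
    threshold: float = 1.1,
) -> Tuple[bool, int, List[float]]:
    """Advance all chains, checking the diagnostic every `check_every` steps.

    Returns (converged, iterations_run, current_states). The states can be
    passed back in as `init_states` to continue a run that did not converge.
    """
    states = list(init_states)
    history: List[List[float]] = [[] for _ in states]
    for it in range(1, max_iters + 1):
        states = [step(s) for s in states]
        for h, s in zip(history, states):
            h.append(s)
        if it % check_every == 0 and gelman_rubin(history) < threshold:
            return True, it, states
    return False, max_iters, states
```

Here `step` stands in for one MCMC update of a single scalar summary. For a rate matrix, one option would be to apply the diagnostic to each entry separately and require all of them to fall below the threshold. Tree topologies would need a different statistic, such as the split-frequency measure mentioned above.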