Caetanods / ratematrix

Bayesian estimation of the evolutionary rate matrix.

The wish list of enhancements for the package #46

Open Caetanods opened 7 years ago

Caetanods commented 7 years ago

This is a list of enhancements for the package that will be implemented at some point. Please feel free to add any item that you would like to see in the package in the comments for this thread.

1) Apply 'checkConvergence' and 'mergePosterior' over lists. [ IMPLEMENTED! ] Right now the user needs to unpack the list of MCMC chains by hand before calling 'checkConvergence' or 'mergePosterior'. It would be easier to pass the list of MCMC objects directly to these functions.

2) Try to 'tar' the MCMC files. If the system has 'tar' and 'untar' functions, 'ratematrixMCMC' can compress the output files to save space at the end of a run. However, this might be dangerous because file compression behaves differently across systems. The best approach is to offer the feature, keep it FALSE as the default, and show a warning so that users try it on a test run before relying on it for a real analysis.
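
A minimal sketch of the opt-in behaviour described in this item, written in Python with the standard library only. The function name, the `compress` flag and the file names are hypothetical and are not part of the package. The sketch assumes that the original files should be deleted only after the archive has been read back and lists every file.

```python
import os
import tarfile
import warnings


def compress_mcmc_files(paths, archive_path, compress=False):
    """Bundle MCMC output files into a gzip-compressed tar archive.

    Returns the archive path, or None when compression is switched off.
    """
    if not compress:
        return None  # opt-in only: the default leaves the files untouched
    warnings.warn(
        "File compression can behave differently between systems; "
        "try it on a short test run first.",
        stacklevel=2,
    )
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in paths:
            # Store each file under its base name, without directories.
            archive.add(path, arcname=os.path.basename(path))
    # Read the archive back and delete the originals only if it is complete.
    with tarfile.open(archive_path, "r:gz") as archive:
        stored = set(archive.getnames())
    if stored == {os.path.basename(path) for path in paths}:
        for path in paths:
            os.remove(path)
    return archive_path
```

Calling `compress_mcmc_files(files, "run.tar.gz")` does nothing, while passing `compress=True` emits the warning, writes the archive, and removes the originals once the archive has been verified.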

3) Enhance the output from 'checkConvergence' when using a single chain. The current output is not very informative and needs to say more about the test used. It would also be useful to decouple the computation of the ESS from the 'checkConvergence' function.
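
To illustrate what a stand-alone ESS computation could look like, here is a minimal Python sketch. It uses a simple autocorrelation-based estimator, N / (1 + 2 * sum of autocorrelations), truncated at the first non-positive autocorrelation. This is an assumption made for illustration and is not necessarily the estimator that 'coda' or the package uses. The function name is hypothetical.

```python
import math


def effective_sample_size(samples):
    """Estimate the effective sample size of one parameter from one chain."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return math.nan  # a constant chain carries no information on mixing
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break  # stop at the first non-positive autocorrelation
        rho_sum += rho
    return n / (1 + 2 * rho_sum)
```

The loop is quadratic in the chain length, which is fine for a sketch but would need an FFT-based autocorrelation for long chains.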

4) No print method for prior samples when 'rebuild.R=TRUE'. Need to implement a print method for the output of the function 'samplePrior' when 'rebuild.R=TRUE'.

5) Allow for inverse-Gamma and half-Cauchy priors. Inverse-Gamma and half-Cauchy distributions are commonly used as priors for a standard deviation. Try to implement these options too.
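
For reference, the log densities of the two distributions can be sketched as below. This is plain Python for illustration. The function names and the shape/scale parameterization are assumptions and do not reflect any existing interface in the package.

```python
import math


def log_inverse_gamma(x, shape, scale):
    """Log density of an inverse-Gamma(shape, scale) distribution at x."""
    if x <= 0:
        return -math.inf  # support is x > 0
    return (
        shape * math.log(scale)
        - math.lgamma(shape)
        - (shape + 1) * math.log(x)
        - scale / x
    )


def log_half_cauchy(x, scale):
    """Log density of a half-Cauchy(0, scale) distribution at x."""
    if x < 0:
        return -math.inf  # support is x >= 0
    return math.log(2) - math.log(math.pi * scale) - math.log1p((x / scale) ** 2)
```

In an MCMC step, the log prior of a proposed standard deviation would be added to the log likelihood before computing the acceptance ratio.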

6) Compute the contrasts only once. Julien Clavel suggested that the contrasts need to be computed only once, and that only the matrices need to be updated at each iteration of the MCMC. I need to take a closer look to see how to incorporate this update.

7) Implement an adaptive MCMC cycle. [ NOT NEEDED! ] It would be great to use adaptive sampling to find the best set of step sizes given the data. This would make sure that the sampler works well with different types of data. Right now the sampler needs to be tuned manually.

8) Extend 'plotPrior'. The 'plotPrior' function currently assumes that the priors for the two rate regimes are the same. It needs to be extended to work with a different prior for each regime. The function also plots only the prior distribution for the evolutionary rate matrix. It needs an option to plot the prior for the root value as well, for example an argument that chooses between plotting the prior for the root and the prior for the rate matrix. It would also be great to plot the prior in a bounded region of the parameter space, so that the posterior can be plotted alongside the prior. Right now these require separate plots.

9) Measurement error. This is important to implement. Both 'phytools' and 'mvMORPH' have examples of how to incorporate measurement error.

Caetanods commented 7 years ago

  1. Unnecessary error message when trying to check the convergence of MCMC chains with different lengths. It would be easy to find the chain with the fewest generations and subset all the others to the same length before calling 'coda'. This would help to check the convergence of analyses that are still running.
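
The truncation step can be sketched as follows in Python. The function names are hypothetical. The second function is a basic Gelman-Rubin potential scale reduction factor for a single parameter, included only to show the truncated chains being used. It is a simplified stand-in and not the diagnostic that 'coda' computes.

```python
import math
import statistics


def truncate_to_shortest(chains):
    """Cut every chain down to the length of the shortest one."""
    if not chains:
        raise ValueError("need at least one chain")
    shortest = min(len(chain) for chain in chains)
    return [list(chain[:shortest]) for chain in chains]


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one parameter."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two chains with two samples each")
    # Mean of the within-chain variances.
    within = statistics.mean([statistics.variance(chain) for chain in chains])
    # Variance of the chain means (between-chain variance divided by n).
    between_over_n = statistics.variance([statistics.mean(chain) for chain in chains])
    if within == 0:
        return math.nan
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)
```

For example, `potential_scale_reduction(truncate_to_shortest(chains))` works on chains of unequal length, because the longer chains are first cut to the length of the shortest one.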

Caetanods commented 7 years ago

  1. No easy way to check the evolutionary correlations among the traits. The package estimates the evolutionary correlations among the traits, but the output is focused on the covariances. Need to add functions, and options to existing functions, to work with, test, and visualize results in the form of evolutionary correlations in an easy way.
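
The conversion itself is simple: each covariance is divided by the product of the two standard deviations, which is what R's 'cov2cor' does for a single matrix. A minimal Python sketch, with hypothetical function names, applied to every rate matrix in a posterior sample:

```python
import math


def cov_to_cor(cov):
    """Convert one covariance matrix (list of rows) to a correlation matrix."""
    k = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(k)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]


def posterior_correlations(rate_matrices, i, j):
    """Correlation between traits i and j for every sampled rate matrix."""
    return [cov_to_cor(matrix)[i][j] for matrix in rate_matrices]
```

The list returned by `posterior_correlations` can then be summarized or plotted like any other posterior sample of a single parameter.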

Caetanods commented 7 years ago

  1. The 'plotRatematrix' function needs to accept a vector of regime names for the parameter 'p'. Right now the argument only accepts the position (index) of each regime, but it is much easier to give the names of the regimes.

Caetanods commented 7 years ago

  1. Make a "smart" default prior for the rates of evolution. There is an easy way to make a better default for the prior distribution on the rates of evolution. I can check the likelihood of rates at different orders of magnitude. This gives an idea, in a very broad sense, of the region where the parameter estimates for the rates are going to be. With this information I can set better bounds for the uniform prior on the rates of evolution. Of course, this is still nowhere near the best solution, which is for the user to set a suitable prior for the analysis.
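
One way the scan over orders of magnitude could look is sketched below in Python. Everything here is an assumption made for illustration: the function names, the grid of exponents, the log-likelihood drop of 10 units used to decide which rates are plausible, the padding of one order of magnitude on each side, and the toy likelihood, which treats a handful of made-up standardized contrasts as draws from a normal distribution with variance equal to the rate.

```python
import math


def rate_bounds_from_scan(log_lik, exponents=range(-6, 7), drop=10.0):
    """Pick uniform-prior bounds from a coarse likelihood scan.

    log_lik: function returning the log likelihood for a given rate.
    Rates of 10**e are kept when their log likelihood is within `drop`
    units of the best one; the bounds are padded by a factor of 10.
    """
    grid = [(10.0 ** e, log_lik(10.0 ** e)) for e in exponents]
    best = max(ll for _, ll in grid)
    kept = [rate for rate, ll in grid if best - ll <= drop]
    return min(kept) / 10, max(kept) * 10


def toy_log_lik(rate, contrasts=(0.9, -1.1, 1.0, -0.8)):
    """Toy example: contrasts assumed normal with mean 0 and variance `rate`."""
    n = len(contrasts)
    sum_sq = sum(c * c for c in contrasts)
    return -0.5 * n * math.log(2 * math.pi * rate) - sum_sq / (2 * rate)


lower, upper = rate_bounds_from_scan(toy_log_lik)
```

The resulting `lower` and `upper` would serve only as broad default bounds for the uniform prior. A prior chosen by the user for the specific analysis remains the better option.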