Closed bob-carpenter closed 10 years ago
More from Ben from the mailing list, in response to Asim:
Here is another prior on Covariance matrices that you may find useful.
http://www.uow.edu.au/~mwand/wispap.pdf
Thanks. The authors sent the first draft of that paper to us for comments a few months ago and it is even better now. That said, I'm not sure it is especially applicable to Stan. One of the virtues of that distribution is that it is conjugate, so it is easy to use in a Gibbs sampler. There is nothing wrong with using conjugate priors in Stan if they correspond to your prior beliefs, but they don't yield any computational benefits for an HMC sampler.
The other thing is that the authors claim that the fact that the correlations are all marginally uniform for a certain choice of a hyperparameter is a good thing. To me, this is a bad thing because marginally uniform correlations are jointly non-uniform in an extreme and implausible way, where the density is concentrated in the corners of the admissible parameter space so that the correlation matrix is almost singular. I much prefer the lkj_corr prior for correlation matrices, which for the default value of its hyperparameter is jointly uniform over correlation matrices of a given size (and for what it is worth, implies that the correlations are marginally beta distributed after a shift-and-scale). Then you could put half-t priors (or something else) on the standard deviations and go from there.
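To make the decomposition above concrete, here is a rough sketch in math notation. The symbols are my own labels, not from the thread: K is the matrix size, \eta is the lkj_corr hyperparameter (default 1), \Omega is the correlation matrix, \sigma is the vector of standard deviations, and \nu and s are hypothetical half-t hyperparameters you would choose yourself.

```latex
\Sigma = \operatorname{diag}(\sigma)\,\Omega\,\operatorname{diag}(\sigma),
\qquad
\sigma_k \sim \text{half-}t_{\nu}(0, s),
\qquad
p(\Omega \mid \eta) \propto \det(\Omega)^{\eta - 1}

% Marginal of a single off-diagonal correlation, after shifting and scaling to (0, 1):
\frac{\Omega_{ij} + 1}{2} \sim \operatorname{Beta}\!\left(\eta - 1 + \tfrac{K}{2},\; \eta - 1 + \tfrac{K}{2}\right)
```

With \eta = 1 the density is constant in \Omega, which is the jointly uniform case. The marginals are then Beta(K/2, K/2) after the shift-and-scale, so they are uniform only when K = 2 and concentrate around zero as K grows.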
More from Ben on stan-users.
This variance-covariance matrix
     [,1] [,2] [,3]
[1,]    2    1   -1
[2,]    1    2    1
[3,]   -1    1    2
is positive semi-definite with rank 2. How to use it with Stan depends on whether the multivariate normal distribution pertains to a likelihood or a prior. If it is a likelihood, you can use any two rows and the corresponding two columns of it (a 2 x 2 principal submatrix) to model data vectors of length 2. If it is a prior, then the best thing to do would be to calculate its Cholesky factor with pivoting (see help(chol) in R). In your case, L works out to
           [,1]     [,2] [,3]
[1,]  1.4142136 0.000000    0
[2,]  0.7071068 1.224745    0
[3,] -0.7071068 1.224745    0
If you declare z to be a vector of length 3 in the parameters block and put iid standard normal priors on its elements, then
L * z
is distributed multivariate normal with mean vector zero and the above variance-covariance matrix. At that point, you just have to prove (or hope) that the posterior is proper.
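As a quick sanity check of the L * z construction, here is a minimal Python sketch using only the standard library. It is not what R's chol(pivot = TRUE) does: the helper names (semidefinite_cholesky, times_transpose, draw) are made up for illustration, and the factorization skips zero pivots instead of pivoting. That shortcut is valid for a positive semi-definite matrix in exact arithmetic, but it is less numerically robust than pivoting.

```python
import math
import random

# The rank-2 variance-covariance matrix from the post.
SIGMA = [[2.0, 1.0, -1.0],
         [1.0, 2.0, 1.0],
         [-1.0, 1.0, 2.0]]


def semidefinite_cholesky(a, tol=1e-12):
    """Return lower-triangular L with L L^T = a for positive semi-definite a.

    Hypothetical helper: columns whose pivot is numerically zero are left
    as zeros. No pivoting is done (unlike R's chol with pivot = TRUE).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= tol:
            continue  # rank-deficient direction: leave column j at zero
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def times_transpose(L):
    """Compute L L^T as a list of rows."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def draw(L, rng):
    """One draw of x = L z with z a vector of iid standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(len(z))) for i in range(len(L))]


L = semidefinite_cholesky(SIGMA)
reconstructed = times_transpose(L)  # should match SIGMA up to rounding
x = draw(L, random.Random(0))       # one draw from the degenerate normal
```

Because the matrix has rank 2, the last column of L is zero and the third element of z never affects L * z. Every draw also satisfies x[0] - x[1] + x[2] = 0 up to rounding, since (1, -1, 1) lies in the null space of the covariance matrix.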
@bob-carpenter, do you still need this note or can we kill the issue?
I'm killing this issue as it's just a note we'll deal with later.
P.S. Matt Wand's paper is now here:
http://projecteuclid.org/download/pdfview_1/euclid.ba/1369407559
I'm going to start compiling Ben's comments from the mailing lists, which should go into a chapter on covariance and correlation matrices in the manual.
I assigned the issue to Ben, because he knows the most about this area of Stan and of stat modeling. I'm happy to help write it up, but will need supervision.