covar and correlation distribution chapter in manual

bob-carpenter commented 11 years ago

I'm going to start compiling Ben's comments from the mailing lists, which should go into a chapter on covariance and correlation matrices in the manual.

I assigned the issue to Ben, because he knows the most about this area of Stan and of stat modeling. I'm happy to help write it up, but will need supervision.

Perhaps it would clarify things to say that a jointly uniform distribution for a correlation matrix of order K is proper. Think about putting all correlation matrices (square, symmetric, positive definite, unit-diagonal) of order K into an urn and picking one out.
A jointly uniform prior for a covariance matrix would be improper, basically because a uniform prior (over the positive reals) on a variance is improper. So, we can't really write down a useful expression for that. In some cases (multivariate normals), it is known that an improper prior on a covariance matrix nevertheless yields a proper posterior distribution (which Stan needs). But in general, that might not be true.
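
To make the proper/improper contrast concrete, here is a small worked case, offered as an illustration rather than as anything from the mailing-list thread. For K = 2 a correlation matrix is determined by a single correlation \rho in (-1, 1), so the uniform density is the constant 1/2 and it integrates to one. A flat density c > 0 on a variance \sigma^2 over the positive reals has no finite normalizing constant:

```latex
\int_{-1}^{1} \frac{1}{2} \, d\rho = 1,
\qquad
\int_{0}^{\infty} c \, d\sigma^{2} = \infty
\quad \text{for any constant } c > 0.
```

The same reasoning extends to larger K: the set of correlation matrices is a bounded subset of [-1, 1]^{K(K-1)/2}, so it has finite volume, whereas the variances on the diagonal of a covariance matrix are unbounded.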

bob-carpenter commented 11 years ago

More from Ben from the mailing list, in response to Asim:

Here is another prior on covariance matrices that you may find useful.

http://www.uow.edu.au/~mwand/wispap.pdf

Thanks. The authors sent the first draft of that paper to us for comments a few months ago and it is even better now. That said, I'm not sure it is especially applicable to Stan. One of the virtues of that distribution is that it is conjugate, so it is easy to use in a Gibbs sampler. There is nothing wrong with using conjugate priors in Stan if they correspond to your prior beliefs, but they don't yield any computational benefits for an HMC sampler.

The other thing is that the authors claim that the fact that the correlations are all marginally uniform for a certain choice of a hyperparameter is a good thing. To me, this is a bad thing because marginally uniform correlations are jointly non-uniform in an extreme and implausible way, where the density is concentrated in the corners of the admissible parameter space so that the correlation matrix is almost singular. I much prefer the lkj_corr prior for correlation matrices, which for the default value of its hyperparameter is jointly uniform over correlation matrices of a given size (and for what it is worth, implies that the correlations are marginally beta distributed after a shift-and-scale). Then you could put half-t priors (or something else) on the standard deviations and go from there.
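
The last step above (a prior on the correlation matrix plus separate priors on the standard deviations) relies on the decomposition Sigma = diag(sd) * Omega * diag(sd). The following is a minimal Python sketch of that decomposition only, not of the priors themselves. The function name `cov_from_sd_and_corr` and the numeric values are made up for illustration.

```python
def cov_from_sd_and_corr(sd, corr):
    """Build a covariance matrix from standard deviations and a correlation matrix.

    Entry (i, j) is sd[i] * corr[i][j] * sd[j], i.e. diag(sd) * corr * diag(sd).
    """
    k = len(sd)
    return [[sd[i] * corr[i][j] * sd[j] for j in range(k)] for i in range(k)]


# Hypothetical values: in a model these would be draws from the priors
# (for example half-t on each sd, lkj_corr on the correlation matrix).
sd = [1.0, 2.0, 0.5]
corr = [
    [1.0, 0.3, -0.2],
    [0.3, 1.0, 0.1],
    [-0.2, 0.1, 1.0],
]

sigma = cov_from_sd_and_corr(sd, corr)
for row in sigma:
    print(row)
```

Separating the scales from the correlations this way lets each part get a prior that matches what you actually believe about it, which is the point of preferring this over a single prior on the whole covariance matrix.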

bob-carpenter commented 10 years ago

More from Ben on stan-users.

This variance-covariance matrix

     [,1] [,2] [,3]
[1,]    2    1   -1
[2,]    1    2    1
[3,]   -1    1    2

is positive semi-definite with rank 2. How to use it with Stan depends on whether the multivariate normal distribution pertains to a likelihood or a prior. If it is a likelihood, you can use any two rows and columns of it to model data vectors of length 2. If it is a prior, then the best thing to do would be to calculate its Cholesky factor with pivoting (see help(chol) in R). In your case, L works out to

           [,1]     [,2] [,3]
[1,]  1.4142136 0.000000    0
[2,]  0.7071068 1.224745    0
[3,] -0.7071068 1.224745    0

If you declare z to be a vector of length 3 in the parameters block and put iid standard normal priors on its elements, then

L * z

is distributed multivariate normal with mean vector zero and the above variance-covariance matrix. At that point, you just have to prove (or hope) that the posterior is proper.
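
As a sanity check on the numbers above, here is a minimal pure-Python sketch (standard library only) that recomputes L, confirms that L times its transpose reproduces the matrix, and simulates L * z. The helper `cholesky_psd` is a hypothetical name, not R's chol. It does no pivoting and simply zeroes a column whose pivot is numerically zero, which is enough here only because the zero pivot happens to come last.

```python
import math
import random

SIGMA = [
    [2.0, 1.0, -1.0],
    [1.0, 2.0, 1.0],
    [-1.0, 1.0, 2.0],
]


def cholesky_psd(a, tol=1e-12):
    """Lower-triangular factor of a positive semi-definite matrix.

    Columns whose pivot is numerically zero are left as zeros (no pivoting).
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d < -tol:
            raise ValueError("matrix is not positive semi-definite")
        if d <= tol:
            continue  # rank-deficient direction: keep this column at zero
        low[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def mat_vec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


L = cholesky_psd(SIGMA)

# Reconstruct L * L^T to compare with SIGMA.
recon = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)] for i in range(3)]

# Simulate y = L * z with iid standard normal z.
random.seed(1)
n_draws = 20000
draws = [mat_vec(L, [random.gauss(0.0, 1.0) for _ in range(3)]) for _ in range(n_draws)]

# Empirical covariance (the mean is known to be zero).
emp = [[sum(y[i] * y[j] for y in draws) / n_draws for j in range(3)] for i in range(3)]

# Because the matrix has rank 2, every draw satisfies y[0] - y[1] + y[2] = 0.
max_violation = max(abs(y[0] - y[1] + y[2]) for y in draws)

print(L)
print(emp)
print(max_violation)
```

Note that the third column of L is zero, so the third element of z never affects L * z. The draws live on a two-dimensional plane, which is what rank 2 means here.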

syclik commented 10 years ago

@bob-carpenter, do you still need this note or can we kill the issue?

bob-carpenter commented 10 years ago

I'm killing this issue as it's just a note we'll deal with later.

P.S. Matt Wand's paper is now here:

http://projecteuclid.org/download/pdfview_1/euclid.ba/1369407559

stan-dev / stan

covar and correlation distribution chapter in manual #218