pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

wishart_like and inverse_wishart_like functions #97

Closed. aperotte closed this issue 12 years ago.

aperotte commented 12 years ago

I think there may be a problem with the functions that return the log likelihood for the Wishart and inverse Wishart distributions. This can be seen by taking two arbitrary positive definite matrices (M1, M2) and an arbitrary value for the degrees of freedom (nu, greater than the number of dimensions), then evaluating pymc.wishart_like(M1, nu, M2) and pymc.inverse_wishart_like(numpy.linalg.inv(M1), nu, numpy.linalg.inv(M2)). These two quantities should be equivalent.
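For concreteness, a minimal sketch of that check (the matrices and degrees of freedom below are arbitrary illustrative values, not from a particular failing case):

import numpy as np
import pymc

# Arbitrary positive definite matrices and degrees of freedom (nu > dimension)
M1 = np.array([[2.0, -0.3], [-0.3, 4.0]])
M2 = np.array([[1.0, 0.3], [0.3, 1.0]])
nu = 3

# If the identity above held, these two log likelihoods would agree:
print(pymc.wishart_like(M1, nu, M2))
print(pymc.inverse_wishart_like(np.linalg.inv(M1), nu, np.linalg.inv(M2)))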

fonnesbeck commented 12 years ago

Is there any reason we are enforcing symmetry on inverse-Wishart likelihoods?

aperotte commented 12 years ago

Hi Chris,

For certain applications you need the fully normalized likelihoods for the Wishart and inverse Wishart. As far as I understand it, a property of the fully normalized Wishart and inverse Wishart likelihoods is that the likelihood of a matrix B under a Wishart parameterized by a matrix W should be the same as the likelihood of the inverse of B under an inverse Wishart parameterized by the inverse of W.

I was using this as a sanity check for my code and found that it does not hold in pymc. I tried to take a look at the source to check the equations, but it seems to call a compiled subroutine.

-Adler

fonnesbeck commented 12 years ago

Are you sure these need to be equivalent? It does not appear so, at least using the Wishart likelihoods in R (MCMCpack) for comparison:

> X <- matrix(c(2,-.3,-.3,4),2,2)
> T <- matrix(c(1,.3,.3,1),2,2)
> dwish(X,3,T)
[1] 0.003072811
> diwish(solve(X),3,solve(T))
[1] 1.520776

fonnesbeck commented 12 years ago

For reference, our likelihoods are in either flib.f or flib_blas.f

aperotte commented 12 years ago

Perhaps I'm mistaken. I thought that was one of the defining properties of the wishart. I will look into it.

Thanks.

fonnesbeck commented 12 years ago

If we believe the functional forms on Wikipedia, they are proportional but not exactly equal. If we do believe they are equivalent, as you expressed, the easiest solution would simply be to blow away our inverse wisharts and call the wishart likelihood with the appropriately-transformed elements.

aperotte commented 12 years ago

Hi Chris,

So I think there are two related issues: whether the implementations in pymc return the correct likelihoods, and whether the identity I thought was true actually holds.

My implementation of the wishart matches the likelihood produced by R (see attached). This doesn't seem to match what is returned by pymc.

My implementation of the inverse Wishart (based on Wikipedia and Gelman, Bayesian Data Analysis) also matches R. This also doesn't seem to match pymc.

I think that perhaps my problem is in the assumption that the likelihoods should be the same. Given that these are probability densities and not mass functions, it makes sense that the densities can differ (the change of variables from S to S^-1 introduces a Jacobian factor) even though the following statement is true:

If S ~ Wishart(Sigma, k), then S^-1 ~ Inv-Wishart(Sigma^-1, k+p+1)

(From Haff, Identities for the inverse wishart ... , 1982)
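Concretely, the Jacobian of the map S -> S^-1 on p x p positive definite matrices is |S|^-(p+1), so the two densities should differ by exactly a factor of |S|^(p+1). A quick numeric check against the MCMCpack numbers above (a sketch using scipy.stats, which was not part of the original discussion):

import numpy as np
from scipy.stats import wishart, invwishart

X = np.array([[2.0, -0.3], [-0.3, 4.0]])
T = np.array([[1.0, 0.3], [0.3, 1.0]])
nu, p = 3, 2

dw = wishart.pdf(X, df=nu, scale=T)           # ~0.003072811, matches dwish(X, 3, T)
diw = invwishart.pdf(np.linalg.inv(X), df=nu,
                     scale=np.linalg.inv(T))  # ~1.520776, matches diwish(solve(X), 3, solve(T))

# The densities differ by exactly the Jacobian factor |X|^(p+1):
print(diw / dw)                     # ~494.9
print(np.linalg.det(X) ** (p + 1))  # ~494.9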

Best,

-Adler

fonnesbeck commented 12 years ago

I agree. I get the following with my Wisharts, using the same example as above:

In [3]: X = np.array([[2, -.3],[-.3, 4]])

In [4]: C = np.array([[1, .3], [.3, 1]])

In [5]: np.exp(wishart_like(X, 3, C))
Out[5]: 0.0037631819208097839

This is close to, but not the same as, what the MCMCpack Wisharts give. Now, I have implemented PyMC's Wishart likelihoods twice, once in Fortran and once in Cython, and they both give the same answer. So, the question is: have I simply implemented a different functional form than MCMCpack? I'm not sure. I will dig in and see.

fonnesbeck commented 12 years ago

I think I solved this mystery. The difference between the R (via MCMCpack) and PyMC Wishart likelihoods is due to the fact that PyMC uses the inverse variance matrix rather than the variance matrix, which is consistent with what BUGS does. Hence, if you invert the variance matrix in R, the results are the same.

R:

> X <- matrix(c(2,-.3,-.3,4),2,2)
> C <- matrix(c(1,.3,.3,1),2,2)
> dwish(X,3,solve(C))
[1] 0.003763183

PyMC:

In [3]: X = np.array([[2, -.3],[-.3, 4]])

In [4]: C = np.array([[1, .3], [.3, 1]])

In [5]: np.exp(wishart_like(X, 3, C))
Out[5]: 0.0037631819208097839

Can you verify this (use the most recent PyMC code from GitHub)?
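To make the two parameterizations concrete, here is a sketch of the Wishart log density in both the textbook scale form and the BUGS/PyMC inverse-scale form (plain numpy/scipy written for illustration; this is not PyMC's actual implementation, which lives in the Fortran/Cython sources mentioned above):

import numpy as np
from scipy.special import multigammaln

def wishart_logp_scale(X, n, V):
    # Textbook form: X ~ Wishart(n, V) with scale matrix V, so E[X] = n * V.
    p = X.shape[0]
    return ((n - p - 1) / 2.0) * np.log(np.linalg.det(X)) \
        - np.trace(np.linalg.solve(V, X)) / 2.0 \
        - (n * p / 2.0) * np.log(2) \
        - (n / 2.0) * np.log(np.linalg.det(V)) \
        - multigammaln(n / 2.0, p)

def wishart_logp_inv_scale(X, n, T):
    # BUGS/PyMC form: T is the *inverse* scale, i.e. T = V^-1.
    return wishart_logp_scale(X, n, np.linalg.inv(T))

X = np.array([[2.0, -0.3], [-0.3, 4.0]])
C = np.array([[1.0, 0.3], [0.3, 1.0]])

print(np.exp(wishart_logp_inv_scale(X, 3, C)))  # ~0.00376318, matches wishart_like(X, 3, C)
print(np.exp(wishart_logp_scale(X, 3, C)))      # ~0.00307281, matches dwish(X, 3, C) in MCMCpack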

aperotte commented 12 years ago

Hi Chris,

I have confirmed what you have found, but I am still a bit confused about what is a variance matrix and what is an inverse variance matrix. In the sources I have looked at, the Wishart is typically parameterized with a positive definite matrix (precision-like). Given two covariance-like positive semi-definite matrices, a and b, and their inverses, a_prime and b_prime, the likelihood of a_prime parameterized by b_prime in my implementation is the same as the likelihood of a_prime parameterized by b in pymc. Here is a quick example.

import numpy as np
import pymc

# Three arbitrary vectors
x = np.array([5, 6])
y = np.array([10, 11])
z = np.array([15, 16])

# Two covariance-like positive semi-definite matrices
a = np.outer(x, x) + np.outer(y, y)
b = np.outer(y, y) + np.outer(z, z)

# Two precision-like positive definite matrices
a_prime = np.linalg.inv(a)
b_prime = np.linalg.inv(b)

# Evaluate the Wishart likelihood (wish is my own implementation module)
print pymc.wishart_like(a_prime, a_prime.shape[0] + 1, b)
print wish.wishart_ll(a_prime, b_prime, a_prime.shape[0] + 1)

These are equal only if the covariance-like b is provided to pymc instead of the precision-like b_prime.

Best,

-Adler

fonnesbeck commented 12 years ago

In Bayesian inference, by convention, you often see distributions parameterized by precision (inverse-variance) rather than by variance per se. This is the case in BUGS/JAGS, so I have simply followed this convention, since many (most?) PyMC users have had prior experience with BUGS. From the WinBUGS 1.4 user's guide appendix:

http://cl.ly/231d0Q1D0d3P0j3W0w18

aperotte commented 12 years ago

Hi Chris,

What I'm saying is that I think the function in that link actually uses a covariance rather than a precision: the R matrix there looks like a covariance matrix.

-Adler

fonnesbeck commented 12 years ago

Sorry for the confusion. Let me put it this way: using the parameterization in PyMC and WinBUGS, the Wishart describes the distribution of precision matrices, not covariance matrices. We do have a wishart_cov_like function, which is the appropriate Wishart likelihood for covariance matrices.

In [14]: np.exp(wishart_cov_like(X, 3, np.linalg.inv(C)))
Out[14]: 0.0037631819208097839

In [15]: np.exp(wishart_like(X, 3, C))
Out[15]: 0.0037631819208097839

Perhaps we should rename them wishart_bugs_like and wishart_like, since wishart_cov_like is the parameterization you see in most textbooks, but is not used in BUGS. For example, in this model:

multmodel = function() {
    for (i in 1:N) {
        Y[i, 1:2] ~ dmnorm(mu[], Phi[,])
    }
    mu[1:2] ~ dmnorm(mu0[], prec[,])
    Phi[1:2, 1:2] ~ dwish(Phi0[,], nu0)
    effect <- mu[2] - mu[1]
    Sigma[1:2, 1:2] <- inverse(Phi[,])
    rho <- Sigma[1,2] / sqrt(Sigma[1,1] * Sigma[2,2])
    Ynew[1:2] ~ dmnorm(mu[], Phi[,])
}

The BUGS version describes precision matrices.

aperotte commented 12 years ago

Hi Chris,

Although I don't use BUGS, the things I'm reading online suggest that the BUGS Wishart is parameterized by a covariance matrix, not a precision matrix.

This article clarified things greatly for me. http://statacumen.com/2009/07/02/wishart-distribution-in-winbugs-nonstandard-parameterization/

-Adler

fonnesbeck commented 12 years ago

Fair enough -- it's a positive definite matrix, in any case. The important thing is that it models the likelihood of precision matrices, correct? The intent was to make wishart_like in PyMC the same as dwish in BUGS, which I believe is currently the case.

aperotte commented 12 years ago

Yes, it is. I think we're on the same page now. I was just confused for a long time because the documentation suggests that it's parameterized by a precision matrix. I guess I am using pymc in a perhaps unconventional way: I am using it for the likelihood functions and the samplers alone, rather than for a BUGS-style simulation.

Sorry if I'm beating a dead horse here.

Thanks for bearing with me.

-Adler

fonnesbeck commented 12 years ago

It never hurts to make sure. Notice that when you sample from our Wishart, you get:

In [2]: S = rwishart(3, [[2, 0.3], [0.3,1]])

In [3]: S
Out[3]: 
matrix([[ 1.48673513,  1.94098825],
        [ 1.94098825,  4.12478239]])

So, the sample corresponds to the inverse of the matrix that parameterizes the distribution, which is in accordance with the likelihood.
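To illustrate numerically: under this convention the mean of the sampled matrices should approach k times the inverse of the parameter matrix, not k times the parameter matrix itself. A quick Monte Carlo sketch (using scipy.stats rather than PyMC's rwishart, whose convention is the point in question):

import numpy as np
from scipy.stats import wishart

T = np.array([[2.0, 0.3], [0.3, 1.0]])
k = 3

# Under the BUGS/PyMC convention, T is the inverse scale, so draw with scale T^-1.
draws = wishart.rvs(df=k, scale=np.linalg.inv(T), size=100000)

print(draws.mean(axis=0))    # ~ k * inv(T) = [[1.57, -0.47], [-0.47, 3.14]]
print(k * np.linalg.inv(T))  # and not k * T = [[6.0, 0.9], [0.9, 3.0]]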