pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Problem in MvNormal: Matrices not aligned #672

Closed · kiudee closed this issue 9 years ago

kiudee commented 9 years ago

I have been converting a model of mine from PyMC 2 to version 3 and I’m encountering a problem in the calculation of the log probability of MvNormal.

In my model, `data` is an n_row × n_col matrix (each row contains one observation of the multivariate normal):

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0, tau=1, shape=n_col)
    prec_matrix = pm.Wishart("prec_matrix", n=n_col+1, V=np.eye(n_col), shape=(n_col, n_col))
    like = pm.MvNormal('likelihood', mu=mu, tau=prec_matrix, observed=data)

When running this example, an error occurs while calculating the dot product dot(delta.T, dot(tau, delta)):

def logp(self, value):
    mu = self.mu
    tau = self.tau

    delta = value - mu
    k = tau.shape[0]

    return 1/2. * (-k * log(2*pi) + log(det(tau)) - dot(delta.T, dot(tau, delta)))
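The mismatch is easy to reproduce in plain NumPy standing in for the Theano graph (a sketch; the shapes are illustrative): with a single observation, `delta` is a length-k vector and the quadratic form works, but with an (n, k) matrix of observations the inner `dot(tau, delta)` no longer aligns.

```python
import numpy as np

n, k = 5, 3
tau = np.eye(k)              # (k, k) precision matrix
one_obs = np.ones(k)         # a single observation: shape (k,)
many_obs = np.ones((n, k))   # n observations, one per row

# Single observation: the quadratic form reduces to a scalar.
q = np.dot(one_obs.T, np.dot(tau, one_obs))   # works; q == 3.0 here

# Matrix of observations: the inner dot is (k, k) x (n, k) -> misaligned.
try:
    np.dot(many_obs.T, np.dot(tau, many_obs))
except ValueError as err:
    print("same failure as in the traceback:", err)
```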

I’m on the following version of Theano:

0.6.0.dev-9db3be23232d1cb4f472b26f2cf7704401cc2306

Here is the complete error:

lib\site-packages\pymc\distributions\multivariate.py in logp(self, value)
     33         k = tau.shape[0]
     34 
---> 35         return 1/2. * (-k * log(2*pi) + log(det(tau)) - dot(delta.T, dot(tau, delta)))
     36 
     37 

lib\site-packages\theano\tensor\basic.pyc in dot(a, b)
   4752         return tensordot(a, b, [[a.ndim - 1], [numpy.maximum(0, b.ndim - 2)]])
   4753     else:
-> 4754         return _dot(a, b)
   4755 
   4756 

lib\site-packages\theano\gof\op.pyc in __call__(self, *inputs, **kwargs)
    539                 thunk.outputs = [storage_map[v] for v in node.outputs]
    540 
--> 541                 required = thunk()
    542                 assert not required  # We provided all inputs
    543 

lib\site-packages\theano\gof\op.pyc in rval(p, i, o, n)
    746 
    747         def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
--> 748             r = p(n, [x[0] for x in i], o)
    749             for o in node.outputs:
    750                 compute_map[o][0] = True

lib\site-packages\theano\tensor\basic.pyc in perform(self, node, inp, out)
   4572         # gives a numpy float object but we need to return a 0d
   4573         # ndarray
-> 4574         z[0] = numpy.asarray(numpy.dot(x, y))
   4575 
   4576     def grad(self, inp, grads):

ValueError: matrices are not aligned
twiecki commented 9 years ago

I wonder if that's related to the Wishart distribution for which some issues have been reported.

kiudee commented 9 years ago

The same error happens when I input a fixed precision matrix. As soon as I get home, I'll see if I can fix it.

kiudee commented 9 years ago

As far as I understand it, the problem is that logp(self, value) expects a single observation as input (in which case the output is the correct log probability). Instead, in model.py, it gets called with the complete array:

self.logp_elemwiset = distribution.logp(*args)

Since I’m not yet familiar with the inner workings of PyMC, what is the expected input and output of the logp function?

kiudee commented 9 years ago

This would be my first draft for the logp function:

def logp(self, value):
    mu = self.mu
    tau = self.tau

    delta = value - mu
    k = tau.shape[0]
    n = value.shape[0]

    result = n * k * log(2*pi) + n * log(det(tau))
    result += (delta.T * dot(tau, delta.T)).sum()
    return -1/2. * result

Which works for this version of the model:

n_col = 3
testdata = np.random.multivariate_normal([1, 2, 3], np.eye(3), size=1000)
with pm.Model() as model:
    mu = pm.Normal('mu', mu=0, tau=5 ** -2, shape=n_col)
    like = pm.MvNormal('likelihood', mu=mu, tau=pm.constant(np.eye(3)), observed=testdata)

The traces for mu can be seen here: [trace plot attachment: trace_mu]

Sadly, this stops working as soon as I include the Wishart prior. I’ll look at it more closely.

kiudee commented 9 years ago

Found another bug: tau is a precision matrix, therefore we need to take the inverse of the determinant:

def logp(self, value):
    mu = self.mu
    tau = self.tau

    delta = value - mu
    k = tau.shape[0]
    n = value.shape[0]

    result = n * k * log(2*pi) + n * log(1./det(tau))
    result += (delta.dot(tau) * delta).sum()
    return -1/2. * result
kiudee commented 9 years ago

After delving a bit deeper into the code of PyMC 3, I realized that the log probabilities are always computed element-wise. The adjusted version of MvNormal.logp then looks like:

def logp(self, value):
    mu = self.mu
    tau = self.tau

    delta = value - mu
    k = tau.shape[0]

    result = k * log(2*pi) + log(1./det(tau))
    result += (delta.dot(tau) * delta).sum(axis=1)
    return -1/2. * result
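The `.sum(axis=1)` line computes one quadratic form per row of `delta`; a quick NumPy check of that identity against an explicit per-row loop (the shapes here are arbitrary, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 4, 3
delta = rng.standard_normal((n, k))   # n observations of dimension k
a = rng.standard_normal((k, k))
tau = a @ a.T + k * np.eye(k)         # a symmetric positive-definite precision matrix

# Vectorized: one quadratic form delta_i' tau delta_i per row.
vec = (delta.dot(tau) * delta).sum(axis=1)

# Reference: explicit loop over the rows.
loop = np.array([row @ tau @ row for row in delta])

assert np.allclose(vec, loop)
```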

Sadly, even with my fixes to the Wishart distribution, the sampler won’t converge to anything remotely useful. I’m open to suggestions, as I’ve already looked over the functions several times. (Note: the problems only occur when I include the Wishart prior distribution.)

kiudee commented 9 years ago

Update: It has to be the logp of the Wishart distribution. I implemented the complete model in emcee and the multivariate normal distribution works fine. As soon as I add the log probability of the Wishart distribution, it returns nonsensical results.

twiecki commented 9 years ago

Looks like great progress, @kiudee. I think @kadeng did some fixes to the Wishart in his PR (not merged yet, as it contains many other parts): https://github.com/kadeng/pymc/blob/master/pymc/distributions/multivariate.py

Perhaps that fixes the remaining issues?

kiudee commented 9 years ago

Thanks for the hint! I’ll look at it. By the way, I just implemented the Lewandowski, Kurowicka and Joe (LKJ) prior distribution for correlation matrices (quick overview on CrossValidated). So far it works great in emcee, and I plan to implement it for PyMC soon.

kiudee commented 9 years ago

I matched my code to @kadeng's but I still have the same problem. His code also still calls the multivariate Gamma function with swapped parameters.

The problem (which also occurs in emcee with the Wishart) is that the log probability and the values in the matrix just keep exploding.
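As an aside on one common way such explosions arise (a sketch with an illustrative matrix size, not taken from the thread): computing log(det(tau)) directly overflows float64 once the determinant gets large, even when its logarithm is perfectly representable; NumPy's slogdet sidesteps this.

```python
import numpy as np

# For a large precision matrix the determinant itself overflows float64,
# even though log|det| is modest.
k = 1000
tau = np.eye(k) * 3.0                    # det = 3**1000, overflows to inf

naive = np.log(np.linalg.det(tau))       # inf
sign, logdet = np.linalg.slogdet(tau)    # finite: 1000 * log(3)

print(naive, sign, logdet)
```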

twiecki commented 9 years ago

Related: https://github.com/pymc-devs/pymc/issues/538 https://github.com/pymc-devs/pymc/issues/395

Did you implement your own Wishart in emcee?

What about the one in pymc2?

kiudee commented 9 years ago

I just looked at the Fortran code of the Wishart distribution in pymc2. I noticed it checks whether the matrix is symmetric and positive definite.
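That check is easy to mirror in NumPy; a sketch of a symmetry plus positive-definiteness test via a Cholesky attempt (the helper name `is_spd` is mine, not from either codebase):

```python
import numpy as np

def is_spd(a, tol=1e-10):
    """Return True if `a` is a symmetric positive-definite matrix."""
    a = np.asarray(a)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        return False
    if not np.allclose(a, a.T, atol=tol):   # symmetry check
        return False
    try:
        np.linalg.cholesky(a)               # succeeds iff positive definite
        return True
    except np.linalg.LinAlgError:
        return False

print(is_spd(np.eye(3)))                            # True
print(is_spd(-np.eye(3)))                           # False (negative definite)
print(is_spd(np.array([[1.0, 2.0], [0.0, 1.0]])))   # False (not symmetric)
```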

Yes, I ported the code to emcee to rule out any sampler causes.

Regarding #538: I noticed this too. If I fix it, I will include a line to automatically set the shape.

twiecki commented 9 years ago

But you're saying the Wishart doesn't work in emcee either, right?

Perhaps the half-t prior is also worth looking at: http://ba.stat.cmu.edu/journal/2013/vol08/issue02/huang.pdf

twiecki commented 9 years ago

Can you put your current changes up on a branch somewhere?

kiudee commented 9 years ago

Yes, I did not get the Wishart prior to work. The LKJ prior for the correlation matrix already works. As soon as I have a clean version in PyMC, I'll create a separate pull request.

kiudee commented 9 years ago

I created a pull request for the fix of the original problem (‘Matrices not aligned’) in #675.