scipy / scipy

SciPy library main repository
https://scipy.org
BSD 3-Clause "New" or "Revised" License
13.03k stars 5.17k forks source link

BUG: stats.multivariate_normal is not giving the right value for PDF #3823

Closed karidajiang closed 10 years ago

karidajiang commented 10 years ago

The pdf function for multivariate_normal doesn't seem to behave as it should. Here is my test file:

mm = [0,0,0]
sigma = [[1,  0,  1],[0,  1,  1],[0,  0,  1]]

var = multivariate_normal(mean=mm, cov=sigma)                      
print var.pdf([1, 2, 6])

And the output is:

-23.2568155996

However, you can calculate by hand that the log pdf for this data point should be:

-14.2568155996

After looking into the code, I found in class multivariate_normal_gen in _multivariate.py, function _logpdf, it says:

prec_U : ndarray  i.e. inverse of the covariance matrix.

However, when I use the covariance matrix from above and look into this member variable:

(Pdb) prec_U
array([[ 1.,  0.,  0.],
          [ 0.,  1.,  0.],
          [ 0.,  0.,  1.]])

This is not the inverse of the covariance matrix... Hope it helps.

gronchi commented 10 years ago

The covariance matrix should always be a symmetric positive definite matrix. The code probably takes advantages of this fact and it reflects the lower triangular submatrix from the covariance inputs into the upper triangular part to compose the covariance matrix.

argriffing commented 10 years ago

@gronchi is correct. The scipy implementation looks only at the lower triangular part of the matrix because it is assumed to be symmetric, so the matrix in this example is treated the same way as the identity matrix would be treated.

To the extent that scipy has conventions, the scipy convention is apparently to not check for symmetry for inputs like this. For example scipy.linalg.eigh expects hermitian input but by default looks at only the lower part of the input matrix and does not raise errors or warnings if given a non-hermitian input matrix.

karidajiang commented 10 years ago

ohh right, thanks for the clarification! On Jul 23, 2014 9:11 PM, "argriffing" notifications@github.com wrote:

@gronchi https://github.com/gronchi is correct. The scipy implementation looks only at the lower triangular part of the matrix because it is assumed to be symmetric, so the matrix in this example is treated the same way as the identity matrix would be treated.

To the extent that scipy has conventions, the scipy convention is apparently to not check for symmetry for inputs like this. For example scipy.linalg.eigh expects hermitian input but by default looks at only the lower part of the input matrix and does not raise errors or warnings if given a non-hermitian input matrix.

— Reply to this email directly or view it on GitHub https://github.com/scipy/scipy/issues/3823#issuecomment-49965670.