open-connectome-classes / StatConn-Spring-2015-Info

introductory material

Positive Definite Matrices #72

Open whock opened 9 years ago

whock commented 9 years ago

When generating random matrices to serve as the adjacency matrix for a graph, I sometimes get an error where kmeans fails to run (scipy.cluster.vq.kmeans2, if that matters). The error message is "Matrix is not positive definite." I looked up this term online but am still a bit confused. What makes a matrix positive definite, and why is this necessary for kmeans to run? Thanks

DSP137 commented 9 years ago

I'm not sure yet about kmeans, but I can tell you what a positive definite matrix is. We say that a matrix A is positive definite if it is symmetric and all of its eigenvalues are positive real numbers. Equivalently, a symmetric matrix A is positive definite if x'Ax is a positive number for every nonzero vector x. I hope this helps!

yaxigeigei commented 9 years ago

Hi, Will. An intuitive example can be found in Reading 3 of the Quantitative Methods of Brain Sciences course we are taking. Figures 2 and 3 on page 8 show various kinds of geometric transformations. The transformation matrix for rotation, but not the others, has complex eigenvalues. Well, maybe it's not so intuitive...
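To make the rotation remark concrete, here is a small NumPy sketch. The two matrices are illustrative choices (a 90-degree rotation and an axis-aligned scaling), not the ones from the reading, which is not reproduced in this thread.

```python
import numpy as np

# Hypothetical example matrices, chosen only for illustration.
theta = np.pi / 2  # 90-degree rotation
rotation = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])
scaling = np.array([
    [2.0, 0.0],
    [0.0, 0.5],
])

# A plane rotation (other than 0 or 180 degrees) leaves no real direction
# unchanged, so its eigenvalues come out as a complex-conjugate pair.
rot_eigs = np.linalg.eigvals(rotation)

# A scaling along the axes has real eigenvalues: the scale factors.
scale_eigs = np.linalg.eigvals(scaling)

print("rotation eigenvalues:", rot_eigs)
print("scaling eigenvalues: ", scale_eigs)
```

Note that the rotation matrix is not symmetric, so it is outside the symmetric setting in which positive definiteness is usually defined.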

DSP137 commented 9 years ago

I should add, here I am also assuming that A and x are real-valued.

mrjiaruiwang commented 9 years ago

There are four classifications: positive definite, positive semidefinite, negative semidefinite, negative definite. Respectively, the eigenvalues are > 0, >= 0, <= 0, < 0.

There's another definition: A symmetric matrix A in R^(n x n) is positive definite if for all nonzero z in R^n, z' A z > 0. Semidefinite is >= 0, and so on for the other conditions.
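Both definitions can be checked numerically. The following is a minimal sketch using NumPy; the helper names `is_positive_definite` and `cholesky_succeeds` and the tolerance `tol` are made up for this example and are not part of any library.

```python
import numpy as np

def is_positive_definite(a, tol=1e-12):
    """Return True if `a` is symmetric and all eigenvalues exceed `tol`."""
    a = np.asarray(a, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        return False
    if not np.allclose(a, a.T):
        return False
    # eigvalsh assumes a symmetric matrix and returns real eigenvalues.
    return bool(np.all(np.linalg.eigvalsh(a) > tol))

def cholesky_succeeds(a):
    """Return True if NumPy can compute a Cholesky factor of `a`."""
    try:
        np.linalg.cholesky(a)
        return True
    except np.linalg.LinAlgError:
        return False

# Eigenvalues 1 and 3: positive definite.
pd_matrix = np.array([[2.0, -1.0],
                      [-1.0, 2.0]])

# Adjacency matrix of a single edge; eigenvalues -1 and 1: indefinite.
edge_adjacency = np.array([[0.0, 1.0],
                           [1.0, 0.0]])

# Quadratic-form view: z' A z is negative for this particular z.
z = np.array([1.0, -1.0])
quad_form = z @ edge_adjacency @ z

print(is_positive_definite(pd_matrix), cholesky_succeeds(pd_matrix))
print(is_positive_definite(edge_adjacency), cholesky_succeeds(edge_adjacency))
print(quad_form)
```

An adjacency matrix without self-loops has a zero diagonal, so its eigenvalues sum to zero and it cannot be positive definite unless it is the zero matrix, which is not positive definite either.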

jovo commented 9 years ago

@whock can you post your script to the repo or a gist? it is not clear to me why a matrix not being positive definite would impact a kmeans algorithm, which should not even check for that...

whock commented 9 years ago

I found a Stack Overflow thread that asked a similar question, and I think the issue is with the code of that implementation of kmeans. There's a bug in the kmeans code that raises that error; it's not due to kmeans "purposefully" trying to check for positive definiteness.

So maybe as a PSA: don't use Python's scipy.cluster.vq.kmeans2, as it has a bug in it according to SO.
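One possible workaround, sketched under an assumption this thread does not confirm: that the error comes from the default `minit="random"` initialization, which in some SciPy versions draws the starting centroids from a Gaussian fitted to the data and factors the data covariance along the way. If that is the cause, choosing the initial centroids from the observations themselves with `minit="points"` avoids that step. The random adjacency matrix below is invented for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Hypothetical random undirected graph on n nodes, with no self-loops.
rng = np.random.default_rng(0)
n = 30
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
adjacency = (upper + upper.T).astype(float)  # kmeans2 expects float data

# Treat each row of the adjacency matrix as one observation and start from
# k rows of the data instead of the default "random" initialization.
k = 2
centroids, labels = kmeans2(adjacency, k, minit="points")

print(centroids.shape)  # one centroid per cluster, each of length n
print(labels.shape)     # one cluster label per node
```

The cluster assignments vary between runs because the starting rows are chosen at random.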