j2kun / svd

Python code implementing the power method for Singular Value Decomposition
https://jeremykun.com/2016/05/16/singular-value-decomposition-part-2-theorem-proof-algorithm/
104 stars 41 forks source link

Fails on square matrices #2

Closed eugenelet closed 6 years ago

eugenelet commented 6 years ago

It seems that this method fails on square matrices (all of the time). Is there anything theoretical that I'm missing about this method?

congxinUSC commented 6 years ago

I think I'v found (part of) the cause for this: In the file svd.py, there is a function SVD_1d. Line 22 should be 'if n >= m:' instead of 'if n > m:'. @eugenelet @j2kun . A lot of other functions depend on this one.

eugenelet commented 6 years ago

@congxinUSC Yeah, I've sent a pull request to get this fixed. Waiting for the admin to respond to that.

congxinUSC commented 6 years ago

@eugenelet Oh! I should have read your pull request earlier, sorry for the duplication. However, I still have bug using svd on non-rank full square matrices, if you have the same issue and some solution plz let me know. Thanks!

eugenelet commented 6 years ago

@congxinUSC What's the problem that you're having for non-full rank matrices?

congxinUSC commented 6 years ago

@eugenelet I didn't make it clear enough, the issue is in my version of implementation so it might be just my bad. Take [[1,2,3], [4,5,6], [7,8,9]] for example, I got the first two singular values correct but the third on is not 0, not even close. I got 4.~

eugenelet commented 6 years ago

@congxinUSC What's the first 2 singular values that you got for your implementation?

congxinUSC commented 6 years ago

Here is the full result, see diag. there are some floating point errors but the third one definitely shouldn't be caused by that. [ 16.84810335261421, 0, 0 ], [ 0, 1.0683695145547085, 0 ], [ 0, 0, 4.4270655276969855 ] ]

congxinUSC commented 6 years ago

4.4270655276969855 @eugenelet

congxinUSC commented 6 years ago

@eugenelet I get [ 16.84810335, 1.06836951, 6.19565208] after using the '=' fixed version of this svd.py, The third value is different form my implementation but it's still wrong. BTW matlab gives [16.8481, 1.0684, 0.0000].

eugenelet commented 6 years ago

@congxinUSC Yeah, I'm still looking into this problem. Let me know if you came across a solution.

congxinUSC commented 6 years ago

@eugenelet @j2kun Sounds like another issue, I'll also try to figure out what's going wrong.

congxinUSC commented 6 years ago

@eugenelet quick but not that elegant workaround, whenever get an eigenvalue higher than the previous one, force it to 0 and short circuit.

eugenelet commented 6 years ago

@congxinUSC Still need a mathematical reason on why is that happening.

j2kun commented 6 years ago

I can confirm that the output of this is nondeterministic.

When computing the third singular value for this matrix [[1,2,3], [4,5,6], [7,8,9]] I get (run a few times, to show it's nondeterministic):

5.00408346
1.87754565
5.07695215
11.89394783
16.00065575
1.68317765

This is due to numerical instability of the part where I subtract the outer product. For instance, when computing the SVD and printing the matrix in each 1d step, on the last one the matrix is reduced to:

[[ -4.44089210e-15  -5.78703752e-15  -6.43929354e-15]
 [ -1.36002321e-15  -1.59247615e-15  -1.85962357e-15]
 [  1.55431223e-15   1.78329573e-15   2.05391260e-15]]

So one fix can be to threshold each entry by epsilon, or threshold the entire matrix based on the max magnitude value. IIRC, if at any step the remaining matrix is zero, then all remaining singular values and vectors are zero.

However, I think this method is inherently not numerically stable. I'm not sure if thresholding is enough to fix the problem. A literature search on numerical stability for the power method might clarify this.

j2kun commented 6 years ago

Closing and moving discussion to the other issue.