Ok. Here's what I found. Suppose we have an SPD matrix called img. Now, follow me.
Method 1: take the matrix logarithm through EIG (not SVD)
w, h = np.linalg.eig(img)
h @ np.diag(np.log(w)) @ h.transpose()
Method 2: take the matrix logarithm through SVD
u2, s2, v2 = np.linalg.svd(img)
u2 @ np.diag(np.log(s2)) @ u2.transpose()
These results are equivalent, and they are both equal (up to some rounding) to the result of scipy.linalg.logm(img).
So there is nothing wrong with SPDTangentSpace here.
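For completeness, here is a self-contained version of that comparison (plain NumPy/SciPy; the matrix img below is just a randomly generated SPD matrix for illustration, and I use eigh, the symmetric-matrix variant of eig):

```python
import numpy as np
from scipy.linalg import logm

# Build a random SPD matrix for illustration (any SPD matrix will do).
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 5))
img = a @ a.T + 5 * np.eye(5)            # symmetric with strictly positive eigenvalues

# Method 1: matrix logarithm via the eigendecomposition.
w, h = np.linalg.eigh(img)               # eigenvalues w, orthonormal eigenvectors h
log_eig = h @ np.diag(np.log(w)) @ h.T

# Method 2: matrix logarithm via the SVD. For an SPD matrix the SVD coincides
# with the eigendecomposition, so the singular values equal the eigenvalues.
u2, s2, v2 = np.linalg.svd(img)
log_svd = u2 @ np.diag(np.log(s2)) @ u2.T

print(np.allclose(log_eig, logm(img)))   # True
print(np.allclose(log_svd, logm(img)))   # True
```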
Now suppose that we have taken the logarithm with either of the methods above and stored the result in logimg. Let's try to go back to img by taking the matrix exponential, again using the two methods above.
Method 1: take the matrix exponential through EIG
w, h = np.linalg.eig(logimg)
h @ np.diag(np.exp(w)) @ h.transpose()
Note that I replaced the log operator with the exp operator.
Method 2: take the matrix exponential through SVD
u2, s2, v2 = np.linalg.svd(logimg)
u2 @ np.diag(np.exp(s2)) @ u2.transpose()
And here we find that the results differ! While the first way (with EIG) is equal to scipy.linalg.expm(logimg), the second one, which relies on SVD, gives a different (and incorrect) result. This is because, for a symmetric matrix A (A = A'), the singular values of A are the absolute values of its eigenvalues, and exp(abs(lambda)) is not equal to exp(lambda) whenever lambda is negative. The matrix logarithm of an SPD matrix has negative eigenvalues as soon as the original matrix has eigenvalues below 1, so the SVD-based exponential silently flips their signs.
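To make that concrete, here is a small sketch (again just NumPy/SciPy, with an SPD matrix constructed to have some eigenvalues below 1, so that its logarithm has negative eigenvalues):

```python
import numpy as np
from scipy.linalg import expm

# An SPD matrix with eigenvalues below 1, so its matrix log has negative eigenvalues.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal basis
d = np.array([0.2, 0.5, 1.3, 3.0, 4.0])
img = q @ np.diag(d) @ q.T
logimg = q @ np.diag(np.log(d)) @ q.T              # matrix log of img; log(0.2), log(0.5) < 0

# Method 1: matrix exponential via the eigendecomposition -- recovers img.
w, h = np.linalg.eigh(logimg)
exp_eig = h @ np.diag(np.exp(w)) @ h.T
print(np.allclose(exp_eig, expm(logimg)))          # True
print(np.allclose(exp_eig, img))                   # True

# Method 2: matrix exponential via the SVD -- wrong, because the singular values
# of the symmetric matrix logimg are the absolute values of its eigenvalues,
# so the negative eigenvalues get their sign flipped before exponentiation.
u2, s2, v2 = np.linalg.svd(logimg)
exp_svd = u2 @ np.diag(np.exp(s2)) @ u2.T
print(np.allclose(exp_svd, img))                   # False
```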
So I suspect that if we want to continue using SVD in the code of SPDUnTangentSpace, some things should be changed...
I just tested untangent_space(tangent_space(Y)). The output is equal, up to a tolerance of 1e-4, which is negligible. To see that, please run python tests/test_modules.py.
Thank you. The source of my confusion was the fact that in some papers people talk about the EIG decomposition, while in practice they use the SVD, which is not exactly the same thing. Even Huang and Van Gool write about EIG, while their Matlab code on GitHub relies on SVD :) There are some minor differences in how the forward and backward passes are computed; the details can be found in the paper by Ionescu et al. on matrix backpropagation.
I performed some neural network training and produced a 50x50 SPD matrix called X, which I compared to another SPD matrix Y (the ground truth) by putting both of them into tangent space and computing a standard MSELoss between them. After some successful iterations during which the loss decreased, I decided to transform X back to "normal space" using untangent_space = SPDUnTangentSpace(unvectorize=False). I found that the result of untangent_space(X) is very different from the true Y. To check again, I then performed the back conversion by taking scipy.linalg.expm(X) instead of applying untangent_space, and then the result was very close to the true Y, as expected.

I looked at spd.py, and the forward function for SPDUnTangentSpace seems OK... nothing suspicious. Another interesting thing: when I applied scipy.linalg.logm to the result of untangent_space(X), I obtained a version of X in which all the signs were flipped, as if scipy.linalg.logm(untangent_space(X)) = -X. Very puzzling. As if it were on a different leaf or something.

I cannot give you the exact matrices for experimenting right now, but I'll try to come up with a simpler example (say, a 5x5 matrix). Maybe it has something to do with the numerical stability of the SVD, or there is something in tangent_space in utils.py?.. I'll look into it, but meanwhile I will refrain from using SPDUnTangentSpace.
UPD: Basically, X has nothing to do with the problem. It boils down to two observations:
untangent_space(tangent_space(Y)) != Y
scipy.linalg.logm(untangent_space(tangent_space(Y))) = -scipy.linalg.logm(Y)
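I don't have my training matrices at hand, but a plain-NumPy stand-in reproduces both observations, assuming (and this is just my guess about the implementation) that the untangent step computes u @ diag(exp(s)) @ u.T from an SVD, and that all eigenvalues of Y are below 1 so that its log has only negative eigenvalues:

```python
import numpy as np
from scipy.linalg import logm

# Hypothetical stand-in for an SVD-based untangent step (my assumption about the code).
def untangent_svd(t):
    u, s, v = np.linalg.svd(t)
    return u @ np.diag(np.exp(s)) @ u.T

rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
d = np.array([0.2, 0.4, 0.6, 0.8, 0.9])      # all eigenvalues of Y below 1
Y = q @ np.diag(d) @ q.T                     # SPD "ground truth"
logY = q @ np.diag(np.log(d)) @ q.T          # tangent-space image of Y (all eigenvalues negative)

Y_back = untangent_svd(logY)
print(np.allclose(Y_back, Y))                # False: observation 1
print(np.allclose(logm(Y_back), -logY))      # True:  observation 2 (all signs flipped)
```

If all eigenvalues of Y were above 1, the log would have only non-negative eigenvalues and the SVD-based exponential would happen to give the right answer, which might be why a round-trip test can still pass.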
By the way, you are using the SVD of a matrix, applying the log and the exp to its singular values, while the original paper by Huang and Van Gool used the eigendecomposition. Is it equivalent?