gonum / matrix

Matrix packages for the Go language [DEPRECATED]

mat64: add generalised singular value decomposition #433

Closed kortschak closed 7 years ago

kortschak commented 7 years ago

@btracey @vladimir-ch Please take a look.

Depends on https://github.com/gonum/lapack/pull/249.

This is functional, but I think the API is open for discussion for a few reasons.

  1. Currently there is no interface for the user to obtain the permuted alpha/vectors.
  2. I would like to have HOGSVD (GSVD with more than two matrices) in the long run (this does not use LAPACK) and it's appealing to me to roll HOGSVD and GSVD together (so that they are actually generalised). This has some unfortunate issues, so it may not be ideal.
btracey commented 7 years ago

Do you have a reference you like for part 2? I've never used this decomposition, so I don't understand the tradeoffs involved. Thanks!

kortschak commented 7 years ago

This is by the person who originally devised it.

btracey commented 7 years ago

Oh, hey, Michael Saunders. I saw him last week!

btracey commented 7 years ago

This paper says the HO GSVD is the same as the 2-matrix GSVD in the two-matrix case. Thus, it seems that there's no harm in making them one function. In that light, perhaps it should be Factorize(mats []Matrix), Values(s []float64, i int), etc. We can document that at the moment it only works for two matrices, but this gives us an option to backward-compatibly extend to HO GSVD.
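
For concreteness, a rough sketch of what that shape could look like, written against the existing mat64.Matrix interface; the type and method names here are illustrative only, not a committed design:

package gsvdsketch

import "github.com/gonum/matrix/mat64"

// GSVDer sketches the unified API suggested above.
type GSVDer interface {
    // Factorize computes the decomposition of the given matrices,
    // reporting whether it succeeded. Initially only len(mats) == 2
    // would be supported, but the slice parameter leaves room to
    // extend backward-compatibly to the higher-order GSVD.
    Factorize(mats []mat64.Matrix) (ok bool)

    // Values returns the generalised singular values associated with
    // the i'th input matrix, storing them into s.
    Values(s []float64, i int) []float64
}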

kortschak commented 7 years ago

In theory they would work under the same API; whether that works in practice is not yet clear. I have smooshed together an untested and unoptimised implementation of HOGSVD which I'll add tests to and clean up soon. This may end up in this PR as HOGSVD and we can discuss if and how they can be unified.

kortschak commented 7 years ago

I have run tests on the HOGSVD code and it looks like numerical stability gets in the way of it working. The issue is the A_i*A_j^-1 calculation, which I cannot see how to do other than by explicitly forming the inverse. Comparing it to a Python implementation using numpy, I can see where we start to deviate. Further investigation shows that numpy calculates the inverse using DGESV rather than DGETRI as we do. @btracey is this likely to be the cause?

btracey commented 7 years ago

For poorly conditioned matrices, I would imagine that could be a difference. That's the problem with poor conditioning -- numerical instability. Based on the documentation, it seems like dgetri should be better as it is explicitly geared toward inverses.

Why can't you do it another way? (A_i * A_j^-1) = (A_j^-T * A_i^T)^T, so

tmp.Solve(A_j.T(), A_i.T())
ans.Clone(tmp.T())
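
Spelled out as a small self-contained example using mat64 names; the inputs here are arbitrary placeholders for A_i and A_j, with A_j assumed non-singular:

package main

import (
    "fmt"

    "github.com/gonum/matrix/mat64"
)

func main() {
    // Arbitrary stand-ins for A_i and A_j.
    aI := mat64.NewDense(2, 2, []float64{4, 1, 1, 3})
    aJ := mat64.NewDense(2, 2, []float64{2, 0, 1, 5})

    // Compute A_i*A_j^-1 without forming the inverse explicitly:
    // solve A_j^T * X = A_i^T, giving X = A_j^-T * A_i^T, then transpose.
    var tmp, ans mat64.Dense
    if err := tmp.Solve(aJ.T(), aI.T()); err != nil {
        panic(err) // A_j is singular to working precision.
    }
    ans.Clone(tmp.T())

    fmt.Printf("A_i*A_j^-1 =\n%v\n", mat64.Formatted(&ans))
}
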
kortschak commented 7 years ago

Thanks. For posterity, here is the python:

import numpy as np

def hogsvd(m):
    # Gramian of each input matrix: a[i] = m[i]^T m[i].
    a = list()
    for d in m:
        a.append(np.dot(d.T, d))
    # Sum of the pairwise quotients a[i]*inv(a[j]) + a[j]*inv(a[i]).
    s = np.zeros((m[0].shape[1], m[0].shape[1]))
    for i in range(len(a)):
        for j in range(i+1, len(a)):
            s = s + np.dot(a[i], np.linalg.inv(a[j])) + np.dot(a[j], np.linalg.inv(a[i]))
    n = len(m)
    s = (n*(n-1))*s
    # The shared right basis vectors v are the eigenvectors of s.
    _, v = np.linalg.eig(s)
    # Project each input matrix onto the shared basis.
    b = list()
    invV = np.linalg.inv(v)
    for d in m:
        b.append(np.dot(invV, d.T).T)
    # Column norms give the generalised singular values; normalising
    # the columns gives the left vectors.
    sigma = list()
    u = list()
    for x in b:
        sig = np.sqrt(np.sum(x*x, axis=0))
        u.append(x/sig)
        sigma.append(np.diag(sig))
    return u, sigma, v

That has cleared up the majority of the disagreement. The remaining difference appears in the call to eig.Factorize, where the numpy equivalent above gives for v, e.g.

[[ 0.31897428  0.94749225  0.40192697]
 [ 0.9286267   0.31848712  0.54698766]
 [ 0.18949371  0.0287121   0.7343427 ]]

and gonum gives (for the same input)

⎡-0.947492   0.318974   0.401927⎤
⎢-0.318487   0.928627   0.546988⎥
⎣-0.028712   0.189494   0.734343⎦

Any ideas?

btracey commented 7 years ago

The eigenvectors are the same, although one of them has a sign flip. However, they are in a different order, and I believe they should be returned ordered by the magnitude of the eigenvalue. Do Go and Python disagree on which eigenvalue is bigger, or do they disagree about the association between eigenvalues and eigenvectors?
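
For what it's worth, here is a quick way to check that mechanically against the two matrices above; this helper is just an illustration, not part of the PR:

package main

import (
    "fmt"
    "math"

    "github.com/gonum/matrix/mat64"
)

// sameColsUpToPermAndSign reports whether the columns of a and b agree
// up to reordering and sign flips, within tol.
func sameColsUpToPermAndSign(a, b mat64.Matrix, tol float64) bool {
    ra, ca := a.Dims()
    rb, cb := b.Dims()
    if ra != rb || ca != cb {
        return false
    }
    used := make([]bool, cb)
search:
    for j := 0; j < ca; j++ {
        for k := 0; k < cb; k++ {
            if used[k] {
                continue
            }
            if colsMatchUpToSign(a, j, b, k, tol) {
                used[k] = true
                continue search
            }
        }
        return false
    }
    return true
}

// colsMatchUpToSign reports whether column j of a equals column k of b
// or its negation, within tol.
func colsMatchUpToSign(a mat64.Matrix, j int, b mat64.Matrix, k int, tol float64) bool {
    r, _ := a.Dims()
    plus, minus := true, true
    for i := 0; i < r; i++ {
        if math.Abs(a.At(i, j)-b.At(i, k)) > tol {
            plus = false
        }
        if math.Abs(a.At(i, j)+b.At(i, k)) > tol {
            minus = false
        }
    }
    return plus || minus
}

func main() {
    gonum := mat64.NewDense(3, 3, []float64{
        -0.947492, 0.318974, 0.401927,
        -0.318487, 0.928627, 0.546988,
        -0.028712, 0.189494, 0.734343,
    })
    numpy := mat64.NewDense(3, 3, []float64{
        0.31897428, 0.94749225, 0.40192697,
        0.9286267, 0.31848712, 0.54698766,
        0.18949371, 0.0287121, 0.7343427,
    })
    // Prints true: the columns agree up to order and sign.
    fmt.Println(sameColsUpToPermAndSign(gonum, numpy, 1e-5))
}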

btracey commented 7 years ago

Ohh, this is using the old Eigen code. Did you try calling the LAPACK-based eigenvalue solver to see if that resolves the issue? It's possible there is a bug in the old code.

kortschak commented 7 years ago

Yeah, it's all OK now. The differences (which were surprising because I thought we were using LAPACK) are due to the old Eigen code and were making it harder to find my actual errors. I massaged v and then found the errors. It works now and has been committed, so you can see the issues with API unification; I don't think they can be unified.

kortschak commented 7 years ago

The failure above seems to be real and unexpected. I have reproduced it with a direct translation into the Dggsvd3 and Dggsvp3 tests. Filed https://github.com/gonum/lapack/issues/267.

kortschak commented 7 years ago

I had HOGSVD working a while ago, but the new Eigen appears to have broken it. I will look into this at home where I have the working version.

Fixed by #440.

kortschak commented 7 years ago

PTAL

btracey commented 7 years ago

LGTM

kortschak commented 7 years ago

I am waiting on some data to make a pair of examples for HOGSVD and GSVD. I can merge this and open an issue for that if you want to move ahead with the monorepoisation, but I will wait for a response to the query above.