wollmich / metas-unclib-python-wrapper

Python Wrapper for METAS UncLib

SVD: unrealistic uncertainties in the left singular vectors of low-rank matrices #13

Closed ZiadHatab closed 1 year ago

ZiadHatab commented 1 year ago

Hi there!

I tested the SVD option; however, it seems to give wildly inaccurate uncertainties in the left singular vectors of low-rank matrices. Below is an example with a 4x4 matrix of rank 2. I tested two cases: first with unique eigenvalues, and second with repeated eigenvalues.

import numpy as np
import metas_unclib as munc
munc.use_linprop()

if __name__ == '__main__':    
    a = np.array([1,2,3,4])
    b = np.array([5,6,7,8])
    sigma = 0.1

    print('==== Unique eigenvalues (low rank) ====')
    A = np.outer(b,a) + np.outer(a,b)
    Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))
    U,S,V = munc.ulinalg.svd(Aunc)
    print('Singular values')
    print(munc.get_value(np.diag(S)).round(2))
    print('')
    U_metas_std = munc.get_stdunc(U)
    print('Std of U')
    print(U_metas_std.round(5))

    print('')
    print('++++++++++++++++++++++++')
    print('++++++++++++++++++++++++')
    print('')

    print('==== Repeated eigenvalues (low rank) ====')
    A = np.outer(b,a) - np.outer(a,b)
    Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))
    U,S,V = munc.ulinalg.svd(Aunc)
    print('Singular values')
    print(munc.get_value(np.diag(S)).round(2))
    print('')
    U_metas_std = munc.get_stdunc(U)
    print('Std of U')
    print(U_metas_std.round(5))

The results are below. As you can see, there are some huge numbers that don't look realistic. I would assume this has to do with repeated eigenvalues, since the problem appears in both cases for the zero eigenvalues.

==== Unique eigenvalues (low rank) ====
Singular values
[142.25   2.25   0.     0.  ]

Std of U
[[6.70000000e-04 2.43500000e-02 1.69587422e+12 2.82065746e+13]
 [6.40000000e-04 3.71900000e-02 1.87627738e+13 3.88727962e+13]
 [5.90000000e-04 3.71900000e-02 4.26131703e+13 6.87413152e+12]
 [5.20000000e-04 2.43500000e-02 2.21545222e+13 1.75403531e+13]]

++++++++++++++++++++++++
++++++++++++++++++++++++

==== Repeated eigenvalues (low rank) ====
Singular values
[17.89 17.89  0.    0.  ]

Std of U
[[3.81124760e+13 4.34657703e+13 6.00828786e+13 6.36324701e+13]
 [7.89392666e+12 3.70121669e+13 1.27539348e+14 4.00601599e+13]
 [2.23246226e+13 3.05585635e+13 7.48300596e+13 1.10777091e+14]
 [5.25431719e+13 2.41049601e+13 7.37359053e+12 8.72047803e+13]]
wollmich commented 1 year ago

Thank you for reporting that issue. I'll take a look at it.

I'm using the following SVD algorithm, see

I implemented the above SVD algorithm in METAS UncLib so that it works with uncertainty objects.

Another way would be matrix derivatives, see https://j-towns.github.io/papers/svd-derivative.pdf

Matrix derivatives are used for the eigenvalue decomposition in METAS UncLib.

ZiadHatab commented 1 year ago

Hi!

I don't think the method discussed in https://j-towns.github.io/papers/svd-derivative.pdf would work, because in equation (11) you are dividing by the difference between the eigenvalues/singular values, and hence dividing by zero in the repeated-eigenvalue case.
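
To make the problem concrete, here is a small NumPy sketch (not UncLib code) of that off-diagonal factor, assuming the F_ij = 1/(s_j^2 - s_i^2) form from the note; for the rank-2 example above two singular values are numerically zero, so the factor explodes instead of being an exact division by zero:

import numpy as np

a = np.array([1., 2., 3., 4.])
b = np.array([5., 6., 7., 8.])
A = np.outer(b, a) + np.outer(a, b)      # rank-2 matrix from the example above

s = np.linalg.svd(A, compute_uv=False)   # ~[142.25, 2.25, ~1e-15, ~1e-16]
diff = s[None, :]**2 - s[:, None]**2
F = np.zeros_like(diff)
off = ~np.eye(len(s), dtype=bool)
F[off] = 1.0 / diff[off]                 # blows up where two singular values (nearly) coincide
print(np.abs(F).max())                   # astronomically large, like the reported std values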

To be honest, I suspected this was the reason for the huge numbers, but I wanted to check first since the code is closed source. Do you also use the same method to compute the derivatives of the eigenvectors? Or do you treat eigenvalues that are very close together as a special case? When I run an eigendecomposition on the example I gave above, the eigenvectors seem to be OK.

Since the code is a wrapper, I can't really help you directly, but I would recommend looking at the issue threads of the JAX and PyTorch packages, as there are long discussions about these ill-conditioned cases: https://github.com/google/jax/issues/669 https://github.com/pytorch/pytorch/issues/57272

You can also check arXiv, there are some papers on how to handle these cases: https://arxiv.org/abs/2104.03821

There is also an old book by Wilkinson called "The Algebraic Eigenvalue Problem" that discusses the derivatives of eigenvectors for many cases, including repeated eigenvalues (a downloadable DjVu copy turns up with a quick search).

ZiadHatab commented 1 year ago

Just to update...

I tested the eigenvalue decomposition again with some low-rank matrices and it generated huge numbers for the uncertainties of the eigenvectors. I basically created the matrix from an eigenvalue formulation and then applied METAS UncLib. Sometimes it works, other times it doesn't.

Here's an example where it doesn't work. You can test other cases just by playing with the eigenvalues.

import numpy as np
import metas_unclib as munc
munc.use_linprop()

if __name__ == '__main__':
    sigma = 0.1
    # full rank matrix
    Q = np.array([
        [2, 1, 3, 0, 4, 1],
        [3, 0, 1, 2, 1, 3],
        [1, 2, 0, 3, 0, 2],
        [2, 1, 2, 1, 1, 1],
        [0, 3, 0, 2, 2, 1],
        [1, 2, 1, 0, 3, 2]])
    # create low-rank matrix with eigenvalue formulation
    A = Q@np.diag([1,1,1,1,0,0])@np.linalg.inv(Q)

    Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))
    V,S  = munc.ulinalg.eig(Aunc)
    S = np.diag(S)
    print('Eigenvalues')
    print(munc.get_value(np.diag(S)).round(2))
    print('')
    V_metas_std = munc.get_stdunc(V)
    print('Std of V')
    print(V_metas_std.round(5))

which results in

Eigenvalues
[ 0. -0.  1.  1.  1.  1.]

Std of V
[[6.24400000e-02 5.25300000e-02 1.88733690e+13 1.85784992e+13 1.42707688e+13 2.71100000e-02]
 [4.73500000e-02 6.70800000e-02 1.20734268e+13 1.37885924e+13 9.83593547e+12 1.68400000e-02]
 [6.77900000e-02 7.11300000e-02 2.33117463e+13 2.38152901e+13 2.51588123e+13 1.18300000e-02]
 [9.58200000e-02 6.04500000e-02 1.32206233e+13 1.31524391e+13 1.16270367e+13 4.51800000e-02]
 [7.56000000e-02 8.79100000e-02 2.36102934e+13 1.64900116e+13 1.59749088e+13 1.75500000e-02]
 [7.41100000e-02 8.34500000e-02 2.43915383e+13 2.84391495e+13 2.82037426e+13 8.78000000e-03]]
wollmich commented 1 year ago

I tested that as well yesterday and had already seen the issue. But thanks again for reporting.

For the eigenvalue decomposition I'm using matrix derivatives for uncertainty propagation. It's based on the following paper: https://www.janmagnus.nl/papers/JRM011.pdf
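
For reference, here is a minimal NumPy check (my reading of the first-order perturbation formula with a pseudo-inverse, not the UncLib internals) for a symmetric matrix with a simple eigenvalue lambda_k and unit eigenvector v_k, dv_k = pinv(lambda_k*I - A) @ dA @ v_k, using the same symmetric test matrix as in the MATLAB session below:

import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 5.],
              [3., 5., 6.]])
dA = 1e-6 * np.array([[1., 2., 0.],
                      [2., -1., 3.],
                      [0., 3., 0.5]])             # small symmetric perturbation

w, V = np.linalg.eigh(A)
k = 2                                             # largest eigenvalue (~11.3448), simple
v = V[:, k]

# pseudo-inverse with an explicit truncation threshold (cf. the rcond discussion below)
dv = np.linalg.pinv(w[k] * np.eye(3) - A, rcond=1e-10) @ dA @ v

w2, V2 = np.linalg.eigh(A + dA)
v2 = V2[:, k] * np.sign(V2[:, k] @ v)             # fix the eigenvector sign ambiguity
print(np.allclose(v2 - v, dv, atol=1e-9))         # should print True: first-order agreement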

wollmich commented 1 year ago

I tested the eigenvalue decomposition again with some low-rank matrices and it generated huge numbers for the uncertainties of the eigenvectors.

I think I found the bug. The pseudo-inverse is used in this paper for the matrix derivative of the eigenvectors.

See the following example in MATLAB using UncLib:

>> a11 = LinProp(1, 0.1)

a11 =

   (1 ± 0.1000)

>> a21 = LinProp(2, 0.1)

a21 =

   (2 ± 0.1000)

>> a31 = LinProp(3, 0.1)

a31 =

   (3 ± 0.1000)

>> a22 = LinProp(4, 0.1)

a22 =

   (4 ± 0.1000)

>> a32 = LinProp(5, 0.1)

a32 =

   (5 ± 0.1000)

>> a33 = LinProp(6, 0.1)

a33 =

   (6 ± 0.1000)

>> A = [a11 a21 a31;a21 a22 a32;a31 a32 a33]

A.Value = 

     1     2     3
     2     4     5
     3     5     6

A.StdUnc = 

    0.1000    0.1000    0.1000
    0.1000    0.1000    0.1000
    0.1000    0.1000    0.1000

>> [V D] = eig(A)

V.Value = 

    0.7370    0.5910    0.3280
    0.3280   -0.7370    0.5910
   -0.5910    0.3280    0.7370

V.StdUnc = 

   1.0e+13 *

    0.0000    0.0000    0.8691
    0.0000    0.0000    1.5660
    0.0000    0.0000    1.9528

D.Value = 

   -0.5157         0         0
         0    0.1709         0
         0         0   11.3448

D.StdUnc = 

    0.1254         0         0
         0    0.1254         0
         0         0    0.1254

>> A2 = [a11 a21 a31;a21 a22 a32;a31 a32 a33+0.001]

A2.Value = 

    1.0000    2.0000    3.0000
    2.0000    4.0000    5.0000
    3.0000    5.0000    6.0010

A2.StdUnc = 

    0.1000    0.1000    0.1000
    0.1000    0.1000    0.1000
    0.1000    0.1000    0.1000

>> [V2 D2] = eig(A2)

V2.Value = 

    0.7372    0.5908    0.3280
    0.3278   -0.7371    0.5910
   -0.5909    0.3281    0.7370

V2.StdUnc = 

    0.0723    0.0913    0.0075
    0.0917    0.0415    0.0056
    0.0397    0.0719    0.0044

D2.Value = 

   -0.5154         0         0
         0    0.1710         0
         0         0   11.3454

D2.StdUnc = 

    0.1254         0         0
         0    0.1254         0
         0         0    0.1254

>> C = get_value(D(3,3)).*eye(3) - get_value(A)

C =

   10.3448   -2.0000   -3.0000
   -2.0000    7.3448   -5.0000
   -3.0000   -5.0000    5.3448

>> inv(C)

ans =

   1.0e+14 *

   -0.2448   -0.4411   -0.5500
   -0.4411   -0.7948   -0.9911
   -0.5500   -0.9911   -1.2359

>> C2 = get_value(D2(3,3)).*eye(3) - get_value(A2)

C2 =

   10.3454   -2.0000   -3.0000
   -2.0000    7.3454   -5.0000
   -3.0000   -5.0000    5.3444

>> inv(C2)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND
=  4.252824e-17. 

ans =

   1.0e+14 *

   -1.3505   -2.4335   -3.0348
   -2.4335   -4.3850   -5.4685
   -3.0348   -5.4685   -6.8197

>> pinv(C)

ans =

    0.0771   -0.0186   -0.0194
   -0.0186    0.0577   -0.0380
   -0.0194   -0.0380    0.0391

>> pinv(C2)

ans =

    0.0771   -0.0186   -0.0194
   -0.0186    0.0577   -0.0380
   -0.0194   -0.0380    0.0391

>> [Us Ss Vs] = svd(C)

Us =

   -0.7370    0.5910    0.3280
   -0.3280   -0.7370    0.5910
    0.5910    0.3280    0.7370

Ss =

   11.8605         0         0
         0   11.1739         0
         0         0    0.0000

Vs =

   -0.7370    0.5910   -0.3280
   -0.3280   -0.7370   -0.5910
    0.5910    0.3280   -0.7370

>> [Us2 Ss2 Vs2] = svd(C2)

Us2 =

   -0.7372    0.5908    0.3280
   -0.3278   -0.7371    0.5910
    0.5909    0.3281    0.7370

Ss2 =

   11.8607         0         0
         0   11.1743         0
         0         0    0.0000

Vs2 =

   -0.7372    0.5908   -0.3280
   -0.3278   -0.7371   -0.5910
    0.5909    0.3281   -0.7370

>> Ss(3,3)

ans =

   4.7309e-15

>> Ss2(3,3)

ans =

   1.4384e-15

>> rcond(C)

ans =

   2.3468e-16

>> rcond(C2)

ans =

   4.2528e-17

>> eps

ans =

   2.2204e-16

>> 

pinv is the MATLAB command to compute the pseudo-inverse, and inv computes the regular matrix inverse. The Intel MKL pseudo-inverse routine used by METAS UncLib uses all singular values of C but only two singular values of C2. That's the reason why the uncertainties of V are huge and the uncertainties of V2 are not.

In my C# code I'm using Intel MKL to compute the pseudo-inverse, see https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/lapack-routines/lapack-least-squares-and-eigenvalue-problem/lapack-least-squares-eigenvalue-problem-driver/linear-least-squares-lls-problems-lapack-driver/gelss.html. If I set rcond to 1e-15 instead of -1, it seems to work.
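
For anyone reading along, the same effect can be reproduced with plain NumPy (this is not the UncLib/MKL code path, just an illustration of the rcond threshold):

import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 5.],
              [3., 5., 6.]])
lam = np.linalg.eigvalsh(A)[-1]                        # largest eigenvalue, ~11.3448
C = lam * np.eye(3) - A                                # singular in exact arithmetic

print(np.abs(np.linalg.inv(C)).max())                  # ~1e14: plain inverse blows up
print(np.abs(np.linalg.pinv(C, rcond=1e-15)).max())    # ~0.08: the tiny singular value is truncated
print(np.abs(np.linalg.pinv(C, rcond=1e-17)).max())    # typically huge again: nothing is truncated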

ZiadHatab commented 1 year ago

Thanks for the update. That would make sense if the C# code didn't compute the pseudo-inverse as a (truncated) generalized inverse. The one in NumPy (and I suppose in MATLAB as well) is based on the SVD: the non-zero singular values are inverted and the zero singular values are left at zero. https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html

In your case this is equivalent to setting a term to zero every time it contains the difference of two equal eigenvalues. A similar derivation was done in this paper, https://arxiv.org/abs/2011.04366 (see equations 4.48-4.52).
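
Something along these lines (my own sketch of the idea; the function name and tolerance are mine, for illustration only):

import numpy as np

def safe_inverse_gaps(s, rtol=1e-12):
    """1/(s_j^2 - s_i^2) with degenerate (and near-degenerate) pairs forced to zero."""
    diff = s[None, :]**2 - s[:, None]**2
    F = np.zeros_like(diff)
    keep = np.abs(diff) > rtol * max(s.max()**2, 1.0)
    F[keep] = 1.0 / diff[keep]
    return F

s = np.array([142.25, 2.25, 1e-14, 1e-15])   # singular values from the rank-2 example
print(safe_inverse_gaps(s))                  # terms coupling the two ~zero values are set to 0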

In your case, do you have access to an SVD routine in C#? If yes, you can compute your own generalized inverse.

wollmich commented 1 year ago

In your case, do you have access to an SVD routine in C#? If yes, you can compute your own generalized inverse.

In my C# code I'm using Intel MKL to compute the pseudo-inverse, see https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/lapack-routines/lapack-least-squares-and-eigenvalue-problem/lapack-least-squares-eigenvalue-problem-driver/linear-least-squares-lls-problems-lapack-driver/gelss.html. If I set rcond to 1e-15 instead of -1, it works. The method is based on the SVD, and the rcond parameter specifies which singular values are used.

This solved the huge uncertainty problem.

But it didn't solve the low rank problem.

ZiadHatab commented 1 year ago

My mistake. I didn't read the documentation.

Regarding the low-rank issue, you can do as in this paper, https://arxiv.org/abs/2011.04366, and force the degenerate terms (differences of equal eigenvalues) to zero.

The low-rank case is no different from having repeated eigenvalues, because you can always shift the eigenvalues with a constant diagonal matrix without changing the eigenvectors (the constant cancels out in the difference calculation).
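
A quick NumPy illustration of that shift argument (plain NumPy, not UncLib code):

import numpy as np

a = np.array([1., 2., 3., 4.])
b = np.array([5., 6., 7., 8.])
A = np.outer(b, a) + np.outer(a, b)        # rank 2, so the eigenvalue 0 is repeated

w, V = np.linalg.eigh(A)
c = 3.0
w_shift = np.linalg.eigvalsh(A + c * np.eye(4))

print(np.allclose(w + c, w_shift))                         # eigenvalues just shift by c
print(np.allclose((A + c * np.eye(4)) @ V, V * (w + c)))   # eigenvectors of A still work for A + c*I
# Note: eigh(A + c*I) may return a different basis inside the (still) degenerate subspace,
# which is exactly why the derivative is not unique there.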

wollmich commented 1 year ago

Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))

I think the assumption that a low-rank matrix has uncorrelated uncertainties is wrong, because in a low-rank matrix some columns or rows are not linearly independent. E.g. if two rows are linearly dependent, then they must be correlated.
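
A quick way to see this for the first example (plain NumPy, propagating the covariance of the 8 inputs a and b to the 16 entries of A = outer(b,a) + outer(a,b)):

import numpy as np

a = np.array([1., 2., 3., 4.])
b = np.array([5., 6., 7., 8.])
sigma = 0.1
n = len(a)

# Jacobian of vec(A) with respect to [a; b], where A_ij = b_i*a_j + a_i*b_j
J = np.zeros((n * n, 2 * n))
for i in range(n):
    for j in range(n):
        row = i * n + j
        J[row, j] += b[i]        # dA_ij/da_j
        J[row, i] += b[j]        # dA_ij/da_i
        J[row, n + i] += a[j]    # dA_ij/db_i
        J[row, n + j] += a[i]    # dA_ij/db_j

cov_A = J @ (sigma**2 * np.eye(2 * n)) @ J.T

print(np.sqrt(np.diag(cov_A)).reshape(n, n).round(4))   # matches the A.StdUnc shown below
off_diag = cov_A - np.diag(np.diag(cov_A))
print(np.abs(off_diag).max())                           # clearly non-zero: the entries of A are correlated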

So if I rerun your first example with correlated inputs, the SVD and eigenvalue decomposition work; see the following example in MATLAB using METAS UncLib:

>> a = LinProp([1 2 3 4], 0.01*eye(4));
>> b = LinProp([5 6 7 8], 0.01*eye(4));
>> A = a'*b + b'*a

A.Value = 

    10    16    22    28
    16    24    32    40
    22    32    42    52
    28    40    52    64

A.StdUnc = 

    1.0198    0.8124    0.9165    1.0296
    0.8124    1.2649    0.9899    1.0954
    0.9165    0.9899    1.5232    1.1747
    1.0296    1.0954    1.1747    1.7889

>> [U S V] = svd(A)

U.Value = 

   -0.2830   -0.7873    0.5467    0.0329
   -0.4132   -0.3595   -0.7535    0.3637
   -0.5434    0.0683   -0.1332   -0.8260
   -0.6737    0.4962    0.3400    0.4294

U.StdUnc = 

    0.0095    0.0435    0.0678    0.1865
    0.0091    0.0664    0.1195    0.1900
    0.0084    0.0665    0.2300    0.0415
    0.0073    0.0439    0.0830    0.0778

S.Value = 

  142.2496         0         0         0
         0    2.2496         0         0
         0         0    0.0000         0
         0         0         0    0.0000

S.StdUnc = 

    2.8342         0         0         0
         0    0.3564         0         0
         0         0    0.0000         0
         0         0         0    0.0000

V.Value = 

   -0.2830    0.7873         0    0.5477
   -0.4132    0.3595    0.4082   -0.7303
   -0.5434   -0.0683   -0.8165   -0.1826
   -0.6737   -0.4962    0.4082    0.3651

V.StdUnc = 

    0.0095    0.0435         0    0.0604
    0.0091    0.0664    0.1062    0.0519
    0.0084    0.0665    0.0118    0.0906
    0.0073    0.0439    0.0844    0.0632

>> U*S*V' - A

ans.Value = 

   1.0e-13 *

   -0.1066         0   -0.0711   -0.0355
    0.0355    0.1776    0.1421    0.1421
         0    0.0711         0   -0.0711
    0.0711    0.2132    0.2132    0.1421

ans.StdUnc = 

   1.0e-14 *

    0.0379    0.0862    0.0778    0.2318
    0.0436    0.0949    0.0578    0.0835
    0.0430    0.0798    0.1004    0.1101
    0.0422    0.1018    0.0837    0.1465

>> U'*U - eye(4)

ans.Value = 

   1.0e-15 *

         0   -0.3331    0.2220   -0.1110
   -0.3331   -0.2220    0.1388    0.0278
    0.2220    0.1388   -0.6661    0.0555
   -0.1110    0.0278    0.0555   -0.3331

ans.StdUnc = 

   1.0e-15 *

    0.0107    0.0209    0.0238    0.0921
    0.0209    0.0530    0.0748    0.0264
    0.0238    0.0748    0.3810    0.0799
    0.0921    0.0264    0.0799    0.1105

>> V'*V - eye(4)

ans.Value = 

   1.0e-15 *

    0.2220    0.0555    0.1665   -0.0278
    0.0555    0.4441    0.0278    0.0833
    0.1665    0.0278   -0.4441         0
   -0.0278    0.0833         0    0.8882

ans.StdUnc = 

   1.0e-15 *

    0.0072    0.0208    0.0086    0.0216
    0.0208    0.2306    0.0456    0.0509
    0.0086    0.0456    0.1948    0.0452
    0.0216    0.0509    0.0452    0.2313

>> 
>> [V2 D] = eig(A)

V2.Value = 

    0.7873    0.4486   -0.3142    0.2830
    0.3595   -0.8324    0.0846    0.4132
   -0.0683    0.3189    0.7735    0.5434
   -0.4962    0.0649   -0.5439    0.6737

V2.StdUnc = 

    0.0435    0.0604    0.0604    0.0095
    0.0664    0.0257    0.0257    0.0091
    0.0665    0.0100    0.0100    0.0084
    0.0439    0.0443    0.0443    0.0073

D.Value = 

   -2.2496         0         0         0
         0   -0.0000         0         0
         0         0    0.0000         0
         0         0         0  142.2496

D.StdUnc = 

    0.3564         0         0         0
         0    0.0000         0         0
         0         0    0.0000         0
         0         0         0    2.8342

>> V2*D*V2' - A

ans.Value = 

   1.0e-13 *

    0.0355    0.0355    0.0355    0.1776
    0.0355    0.0711    0.1421    0.2132
    0.0711    0.1421    0.1421    0.2842
    0.1421    0.2132    0.2132    0.2842

ans.StdUnc = 

   1.0e-14 *

    0.1383    0.0738    0.0910    0.0897
    0.0803    0.0456    0.0715    0.0849
    0.0887    0.0720    0.1003    0.0959
    0.0940    0.0874    0.0885    0.1435

>> V2'*V2 - eye(4)

ans.Value = 

   1.0e-15 *

    0.4441   -0.2498   -0.2776    0.0555
   -0.2498         0    0.2429    0.0139
   -0.2776    0.2429    0.4441    0.1665
    0.0555    0.0139    0.1665    0.2220

ans.StdUnc = 

   1.0e-15 *

    0.2313    0.7519    0.8001    0.0311
    0.7519    0.0519    0.0147    0.0237
    0.8001    0.0147    0.0507    0.0181
    0.0311    0.0237    0.0181    0.0025

If you agree I'll close this issue.

ZiadHatab commented 1 year ago

That is a good point, I completely forgot that. Thanks!

One thing that still bothers me is the fact that we propagate uncertainties by computing the Jacobian, i.e., Unew = J@Uold@J.T. Even if you start with correlated uncertainties, the computation of the Jacobian shouldn't depend on that fact. If the Jacobian is not the issue, how do we obtain such large numbers?

wollmich commented 1 year ago

One thing that still bothers me is the fact that we propagate uncertainties by computing the Jacobian, i.e., Unew = J@Uold@J.T. Even if you start with correlated uncertainties, the computation of the Jacobian shouldn't depend on that fact. If the Jacobian is not the issue, how do we obtain such large numbers?

This has to do with how METAS UncLib handles the Jacobian matrix internally.

E.g.: The SVD of the above matrix A:

If the matrix A is treated as uncorrelated, then we have 16 input quantities and 2*16 + 4 = 36 output quantities, so the size of the Jacobian matrix is 36 x 16.

In the case where the matrix A is based on 8 uncorrelated inputs, the size of the Jacobian matrix is 36 x 8.

METAS UncLib updates the Jacobian matrix using the chain rule at each computational step.
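
Schematically, for the 4x4 SVD example (shapes only; the Jacobians here are random placeholders, not the real derivatives):

import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1

# 4x4 SVD: vec(U) + diag(S) + vec(V) -> 16 + 4 + 16 = 36 output quantities
n_out, n_A, n_ab = 36, 16, 8
J_svd = rng.normal(size=(n_out, n_A))   # placeholder for d(outputs)/d(vec(A))
J_A   = rng.normal(size=(n_A, n_ab))    # placeholder for d(vec(A))/d([a; b])

# A treated as 16 uncorrelated inputs: 36x16 Jacobian
U_out_1 = J_svd @ (sigma**2 * np.eye(n_A)) @ J_svd.T
# A derived from 8 uncorrelated inputs: the chain rule gives a 36x8 Jacobian
J_total = J_svd @ J_A
U_out_2 = J_total @ (sigma**2 * np.eye(n_ab)) @ J_total.T

print(J_svd.shape, J_total.shape, U_out_1.shape, U_out_2.shape)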

wollmich commented 1 year ago

See also https://iopscience.iop.org/article/10.1088/0026-1394/49/6/809/pdf

ZiadHatab commented 1 year ago

Oh yes, you are right. I forgot that the size of J depends on the assumption of which parameters are basic and which are derived. Thanks for the clarification.