Thank you for reporting that issue. I'll take a look at it.
I'm using the following SVD algorithm, see
So I implemented the above SVD algorithm in METAS UncLib so that it works with uncertainty objects.
Another way would be matrix derivatives, see https://j-towns.github.io/papers/svd-derivative.pdf
Matrix derivatives are used for the eigenvalue decomposition in METAS UncLib.
Hi!
I don't think the method discussed in https://j-towns.github.io/papers/svd-derivative.pdf would work, because in equation (11) you divide by the difference between the eigenvalues/singular values, and hence divide by zero in the repeated-eigenvalue case.
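To illustrate what I mean, here is a minimal NumPy sketch (my own reading of the paper, not actual library code): the SVD derivative there uses a matrix with off-diagonal entries 1/(s_j^2 - s_i^2), and those entries blow up as soon as singular values repeat.

import numpy as np

# minimal sketch: reciprocal-difference matrix used in the SVD derivative,
# off-diagonal entries 1/(s_j^2 - s_i^2), for a matrix with repeated singular values
A = np.diag([1.0, 1.0, 0.0])              # singular values 1, 1, 0
s = np.linalg.svd(A, compute_uv=False)

with np.errstate(divide='ignore'):
    diff = s[None, :]**2 - s[:, None]**2  # s_j^2 - s_i^2
    F = np.where(np.eye(len(s), dtype=bool), 0.0, 1.0 / diff)

print(F)  # the off-diagonal entries for the repeated pair are inf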
To be honest, I thought this might be the reason for the huge numbers, but I wanted to check first since the code is closed-source. Do you also use this method to compute the derivatives for the eigenvectors? Or do you treat eigenvalues that are very close together as a special case? When I do the eigendecomposition for the example I gave above, the eigenvectors seem to be OK.
Since the code is a wrapper, I can't really help you directly, but I would recommend looking at the issue threads of the JAX and Pytorch packages, as there is a long discussion about these ill-conditioned cases: https://github.com/google/jax/issues/669 https://github.com/pytorch/pytorch/issues/57272
You can also check arXiv; there are some papers on how to handle these cases: https://arxiv.org/abs/2104.03821
There is also an old book by Wilkinson called "The algebraic eigenvalue problem" that discusses the derivative of eigenvectors for many cases, including repeated eigenvalues (you can find a downloadable djvu file by simply googling it).
Just to update...
I tested the eigenvalue decomposition again with some low-rank matrices and it generated huge numbers for the uncertainties of the eigenvectors. I basically created the matrix from an eigenvalue formulation and then applied METAS UncLib. Sometimes it works, other times it doesn't.
Here's an example where it doesn't work. You can test other cases just by playing with the eigenvalues.
import numpy as np
import metas_unclib as munc
munc.use_linprop()
if __name__ == '__main__':
    sigma = 0.1
    # full-rank matrix used to define the eigenvectors
    Q = np.array([
        [2, 1, 3, 0, 4, 1],
        [3, 0, 1, 2, 1, 3],
        [1, 2, 0, 3, 0, 2],
        [2, 1, 2, 1, 1, 1],
        [0, 3, 0, 2, 2, 1],
        [1, 2, 1, 0, 3, 2]])
    # create a low-rank matrix via an eigenvalue formulation (eigenvalues 1,1,1,1,0,0)
    A = Q@np.diag([1, 1, 1, 1, 0, 0])@np.linalg.inv(Q)
    Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))
    V, S = munc.ulinalg.eig(Aunc)
    S = np.diag(S)
    print('Eigenvalues')
    print(munc.get_value(np.diag(S)).round(2))
    print('')
    U_metas_std = munc.get_stdunc(V)
    print('Std of V')
    print(U_metas_std.round(5))
which results in
Eigenvalues
[ 0. -0. 1. 1. 1. 1.]
Std of V
[[6.24400000e-02 5.25300000e-02 1.88733690e+13 1.85784992e+13 1.42707688e+13 2.71100000e-02]
[4.73500000e-02 6.70800000e-02 1.20734268e+13 1.37885924e+13 9.83593547e+12 1.68400000e-02]
[6.77900000e-02 7.11300000e-02 2.33117463e+13 2.38152901e+13 2.51588123e+13 1.18300000e-02]
[9.58200000e-02 6.04500000e-02 1.32206233e+13 1.31524391e+13 1.16270367e+13 4.51800000e-02]
[7.56000000e-02 8.79100000e-02 2.36102934e+13 1.64900116e+13 1.59749088e+13 1.75500000e-02]
[7.41100000e-02 8.34500000e-02 2.43915383e+13 2.84391495e+13 2.82037426e+13 8.78000000e-03]]
I tested that yesterday as well and had already seen that issue. But thanks again for reporting.
For the eigenvalue decomposition I'm using matrix derivatives for uncertainty propagation. It's based on the following paper: https://www.janmagnus.nl/papers/JRM011.pdf
> I tested the eigenvalue decomposition again with some low-rank matrices and it generated huge numbers for the uncertainties of the eigenvectors.
I think I found the bug. In this paper the pseudo-inverse is used for the matrix derivative of the eigenvectors.
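As far as I understand, for a symmetric matrix A with eigenpair (lambda, v) the eigenvector differential boils down to dv = pinv(lambda*I - A) @ dA @ v. Here is a small NumPy sketch of that formula (just an illustration, not the UncLib code); lambda*I - A is singular by construction, so everything hinges on how the pseudo-inverse treats the (near-)zero singular value.

import numpy as np

# sketch of the eigenvector sensitivity for a symmetric matrix:
#   dv = pinv(lambda*I - A) @ dA @ v
A = np.array([[1., 2., 3.],
              [2., 4., 5.],
              [3., 5., 6.]])
w, V = np.linalg.eigh(A)
lam, v = w[-1], V[:, -1]                      # largest eigenvalue and its eigenvector

dA = 1e-6 * np.array([[1., 0., 0.],           # small symmetric perturbation
                      [0., 0., 1.],
                      [0., 1., 0.]])

dv = np.linalg.pinv(lam * np.eye(3) - A) @ dA @ v   # linearized change of v

w2, V2 = np.linalg.eigh(A + dA)
v2 = V2[:, -1] * np.sign(V2[:, -1] @ v)       # resolve the sign ambiguity
print(dv)
print(v2 - v)                                 # agrees with dv to first order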
See the following example in MATLAB using UncLib:
>> a11 = LinProp(1, 0.1)
a11 =
(1 ± 0.1000)
>> a21 = LinProp(2, 0.1)
a21 =
(2 ± 0.1000)
>> a31 = LinProp(3, 0.1)
a31 =
(3 ± 0.1000)
>> a22 = LinProp(4, 0.1)
a22 =
(4 ± 0.1000)
>> a32 = LinProp(5, 0.1)
a32 =
(5 ± 0.1000)
>> a33 = LinProp(6, 0.1)
a33 =
(6 ± 0.1000)
>> A = [a11 a21 a31;a21 a22 a32;a31 a32 a33]
A.Value =
1 2 3
2 4 5
3 5 6
A.StdUnc =
0.1000 0.1000 0.1000
0.1000 0.1000 0.1000
0.1000 0.1000 0.1000
>> [V D] = eig(A)
V.Value =
0.7370 0.5910 0.3280
0.3280 -0.7370 0.5910
-0.5910 0.3280 0.7370
V.StdUnc =
1.0e+13 *
0.0000 0.0000 0.8691
0.0000 0.0000 1.5660
0.0000 0.0000 1.9528
D.Value =
-0.5157 0 0
0 0.1709 0
0 0 11.3448
D.StdUnc =
0.1254 0 0
0 0.1254 0
0 0 0.1254
>> A2 = [a11 a21 a31;a21 a22 a32;a31 a32 a33+0.001]
A2.Value =
1.0000 2.0000 3.0000
2.0000 4.0000 5.0000
3.0000 5.0000 6.0010
A2.StdUnc =
0.1000 0.1000 0.1000
0.1000 0.1000 0.1000
0.1000 0.1000 0.1000
>> [V2 D2] = eig(A2)
V2.Value =
0.7372 0.5908 0.3280
0.3278 -0.7371 0.5910
-0.5909 0.3281 0.7370
V2.StdUnc =
0.0723 0.0913 0.0075
0.0917 0.0415 0.0056
0.0397 0.0719 0.0044
D2.Value =
-0.5154 0 0
0 0.1710 0
0 0 11.3454
D2.StdUnc =
0.1254 0 0
0 0.1254 0
0 0 0.1254
>> C = get_value(D(3,3)).*eye(3) - get_value(A)
C =
10.3448 -2.0000 -3.0000
-2.0000 7.3448 -5.0000
-3.0000 -5.0000 5.3448
>> inv(C)
ans =
1.0e+14 *
-0.2448 -0.4411 -0.5500
-0.4411 -0.7948 -0.9911
-0.5500 -0.9911 -1.2359
>> C2 = get_value(D2(3,3)).*eye(3) - get_value(A2)
C2 =
10.3454 -2.0000 -3.0000
-2.0000 7.3454 -5.0000
-3.0000 -5.0000 5.3444
>> inv(C2)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND
= 4.252824e-17.
ans =
1.0e+14 *
-1.3505 -2.4335 -3.0348
-2.4335 -4.3850 -5.4685
-3.0348 -5.4685 -6.8197
>> pinv(C)
ans =
0.0771 -0.0186 -0.0194
-0.0186 0.0577 -0.0380
-0.0194 -0.0380 0.0391
>> pinv(C2)
ans =
0.0771 -0.0186 -0.0194
-0.0186 0.0577 -0.0380
-0.0194 -0.0380 0.0391
>> [Us Ss Vs] = svd(C)
Us =
-0.7370 0.5910 0.3280
-0.3280 -0.7370 0.5910
0.5910 0.3280 0.7370
Ss =
11.8605 0 0
0 11.1739 0
0 0 0.0000
Vs =
-0.7370 0.5910 -0.3280
-0.3280 -0.7370 -0.5910
0.5910 0.3280 -0.7370
>> [Us2 Ss2 Vs2] = svd(C2)
Us2 =
-0.7372 0.5908 0.3280
-0.3278 -0.7371 0.5910
0.5909 0.3281 0.7370
Ss2 =
11.8607 0 0
0 11.1743 0
0 0 0.0000
Vs2 =
-0.7372 0.5908 -0.3280
-0.3278 -0.7371 -0.5910
0.5909 0.3281 -0.7370
>> Ss(3,3)
ans =
4.7309e-15
>> Ss2(3,3)
ans =
1.4384e-15
>> rcond(C)
ans =
2.3468e-16
>> rcond(C2)
ans =
4.2528e-17
>> eps
ans =
2.2204e-16
>>
pinv is the MATLAB command to compute the pseudo-inverse and inv the command to compute the matrix inverse. The Intel MKL pseudo-inverse routine used by METAS UncLib inverts all singular values of C, but only two singular values of C2. That's the reason why the uncertainties of V are huge and the uncertainties of V2 are not.
In my C# code I'm using Intel MKL for computing the pseudo-inverse, see https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/lapack-routines/lapack-least-squares-and-eigenvalue-problem/lapack-least-squares-eigenvalue-problem-driver/linear-least-squares-lls-problems-lapack-driver/gelss.html. If rcond is set to 1e-15 instead of -1, it seems to work.
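For illustration, a hand-rolled sketch of what I understand the rcond threshold to do (the pinv_rcond helper is mine, not the MKL/gelss code): singular values below rcond times the largest one are treated as zero instead of being inverted.

import numpy as np

# sketch of a truncated pseudo-inverse with an explicit rcond threshold
def pinv_rcond(M, rcond):
    U, s, Vt = np.linalg.svd(M)
    keep = s > rcond * s.max()                # singular values above the threshold
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ np.diag(s_inv) @ U.T

# rebuild C = lambda_max*eye(3) - A from the example above (singular up to rounding)
A = np.array([[1., 2., 3.], [2., 4., 5.], [3., 5., 6.]])
C = np.linalg.eigvalsh(A)[-1] * np.eye(3) - A

print(np.abs(pinv_rcond(C, 1e-15)).max())     # small: the ~1e-15 singular value is dropped
print(np.abs(pinv_rcond(C, 0.0)).max())       # huge: every nonzero singular value is inverted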
Thanks for the update. That would make sense if the C# code didn't compute the pinv as a generalized inverse. The one in NumPy (I suppose also in MATLAB) is based on the SVD: the non-zero singular values are inverted, while singular values below a tolerance are set to zero instead, see https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html
In your case this is equivalent to saying that every time you take the difference between identical eigenvalues, you just set it to zero. A similar derivation was done in this paper (see equations 4.48-4.52): https://arxiv.org/abs/2011.04366
In your case, do you have access to an SVD routine in C#? If yes, you can compute your own generalized inverse.
> In your case, do you have access to an SVD routine in C#? If yes, you can compute your own generalized inverse.
In my C# code I'm using Intel MKL for computing the pseudo-inverse, see https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/lapack-routines/lapack-least-squares-and-eigenvalue-problem/lapack-least-squares-eigenvalue-problem-driver/linear-least-squares-lls-problems-lapack-driver/gelss.html. If rcond is set to 1e-15 instead of -1, it works. The method uses the SVD, and the rcond parameter specifies which singular values are used.
This solved the huge uncertainty problem.
But it didn't solve the low rank problem.
My mistake. I didn't read the documentation.
Regarding the low-rank issue, you can do as in this paper https://arxiv.org/abs/2011.04366 and force the degenerate terms (differences between equal eigenvalues) to zero.
The low-rank case is no different from having repeated eigenvalues, because you can always shift all eigenvalues by a constant (add a scaled identity matrix) without changing the eigenvectors, and the constant cancels out in the difference calculation.
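A rough sketch of what I mean (my own illustration of the idea, not the exact equations of the paper): force the reciprocal differences between (nearly) equal eigenvalues to zero; a constant shift of all eigenvalues then changes nothing, which is why the low-rank case behaves exactly like repeated eigenvalues.

import numpy as np

# sketch: reciprocal eigenvalue-difference matrix with degenerate terms forced to zero
def safe_inverse_diff(w, tol=1e-12):
    diff = w[None, :] - w[:, None]            # lambda_j - lambda_i
    degenerate = np.abs(diff) < tol
    return np.where(degenerate, 0.0, 1.0 / np.where(degenerate, 1.0, diff))

w = np.array([1., 1., 1., 1., 0., 0.])        # eigenvalues from the first example above
print(safe_inverse_diff(w))

# a constant shift of all eigenvalues cancels in the differences
print(np.allclose(safe_inverse_diff(w), safe_inverse_diff(w + 5.0)))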
> Aunc = munc.ufloatarray(A, covariance=sigma**2*np.eye(np.size(A)))
I think the assumption that a low-rank matrix has uncorrelated uncertainties is wrong, because in a low-rank matrix some columns or rows are linearly dependent. E.g., if two rows are linearly dependent, then they must be correlated.
So if I rerun your first example with correlated uncertainties, the SVD and the eigenvalue decomposition work; see the following example in MATLAB using METAS UncLib:
>> a = LinProp([1 2 3 4], 0.01*eye(4));
>> b = LinProp([5 6 7 8], 0.01*eye(4));
>> A = a'*b + b'*a
A.Value =
10 16 22 28
16 24 32 40
22 32 42 52
28 40 52 64
A.StdUnc =
1.0198 0.8124 0.9165 1.0296
0.8124 1.2649 0.9899 1.0954
0.9165 0.9899 1.5232 1.1747
1.0296 1.0954 1.1747 1.7889
>> [U S V] = svd(A)
U.Value =
-0.2830 -0.7873 0.5467 0.0329
-0.4132 -0.3595 -0.7535 0.3637
-0.5434 0.0683 -0.1332 -0.8260
-0.6737 0.4962 0.3400 0.4294
U.StdUnc =
0.0095 0.0435 0.0678 0.1865
0.0091 0.0664 0.1195 0.1900
0.0084 0.0665 0.2300 0.0415
0.0073 0.0439 0.0830 0.0778
S.Value =
142.2496 0 0 0
0 2.2496 0 0
0 0 0.0000 0
0 0 0 0.0000
S.StdUnc =
2.8342 0 0 0
0 0.3564 0 0
0 0 0.0000 0
0 0 0 0.0000
V.Value =
-0.2830 0.7873 0 0.5477
-0.4132 0.3595 0.4082 -0.7303
-0.5434 -0.0683 -0.8165 -0.1826
-0.6737 -0.4962 0.4082 0.3651
V.StdUnc =
0.0095 0.0435 0 0.0604
0.0091 0.0664 0.1062 0.0519
0.0084 0.0665 0.0118 0.0906
0.0073 0.0439 0.0844 0.0632
>> U*S*V' - A
ans.Value =
1.0e-13 *
-0.1066 0 -0.0711 -0.0355
0.0355 0.1776 0.1421 0.1421
0 0.0711 0 -0.0711
0.0711 0.2132 0.2132 0.1421
ans.StdUnc =
1.0e-14 *
0.0379 0.0862 0.0778 0.2318
0.0436 0.0949 0.0578 0.0835
0.0430 0.0798 0.1004 0.1101
0.0422 0.1018 0.0837 0.1465
>> U'*U - eye(4)
ans.Value =
1.0e-15 *
0 -0.3331 0.2220 -0.1110
-0.3331 -0.2220 0.1388 0.0278
0.2220 0.1388 -0.6661 0.0555
-0.1110 0.0278 0.0555 -0.3331
ans.StdUnc =
1.0e-15 *
0.0107 0.0209 0.0238 0.0921
0.0209 0.0530 0.0748 0.0264
0.0238 0.0748 0.3810 0.0799
0.0921 0.0264 0.0799 0.1105
>> V'*V - eye(4)
ans.Value =
1.0e-15 *
0.2220 0.0555 0.1665 -0.0278
0.0555 0.4441 0.0278 0.0833
0.1665 0.0278 -0.4441 0
-0.0278 0.0833 0 0.8882
ans.StdUnc =
1.0e-15 *
0.0072 0.0208 0.0086 0.0216
0.0208 0.2306 0.0456 0.0509
0.0086 0.0456 0.1948 0.0452
0.0216 0.0509 0.0452 0.2313
>>
>> [V2 D] = eig(A)
V2.Value =
0.7873 0.4486 -0.3142 0.2830
0.3595 -0.8324 0.0846 0.4132
-0.0683 0.3189 0.7735 0.5434
-0.4962 0.0649 -0.5439 0.6737
V2.StdUnc =
0.0435 0.0604 0.0604 0.0095
0.0664 0.0257 0.0257 0.0091
0.0665 0.0100 0.0100 0.0084
0.0439 0.0443 0.0443 0.0073
D.Value =
-2.2496 0 0 0
0 -0.0000 0 0
0 0 0.0000 0
0 0 0 142.2496
D.StdUnc =
0.3564 0 0 0
0 0.0000 0 0
0 0 0.0000 0
0 0 0 2.8342
>> V2*D*V2' - A
ans.Value =
1.0e-13 *
0.0355 0.0355 0.0355 0.1776
0.0355 0.0711 0.1421 0.2132
0.0711 0.1421 0.1421 0.2842
0.1421 0.2132 0.2132 0.2842
ans.StdUnc =
1.0e-14 *
0.1383 0.0738 0.0910 0.0897
0.0803 0.0456 0.0715 0.0849
0.0887 0.0720 0.1003 0.0959
0.0940 0.0874 0.0885 0.1435
>> V2'*V2 - eye(4)
ans.Value =
1.0e-15 *
0.4441 -0.2498 -0.2776 0.0555
-0.2498 0 0.2429 0.0139
-0.2776 0.2429 0.4441 0.1665
0.0555 0.0139 0.1665 0.2220
ans.StdUnc =
1.0e-15 *
0.2313 0.7519 0.8001 0.0311
0.7519 0.0519 0.0147 0.0237
0.8001 0.0147 0.0507 0.0181
0.0311 0.0237 0.0181 0.0025
If you agree I'll close this issue.
That is a good point, I completely forgot that. Thanks!
One thing that still bothers me is the fact that we propagate uncertainties by computing the Jacobian, i.e., Unew = J@Uold@J.T. Even if you start with correlated uncertainties, the computation of the Jacobian shouldn't depend on that fact. If the Jacobian is not the issue, how do we obtain the large numbers?
> One thing that still bothers me is the fact that we propagate uncertainties by computing the Jacobian, i.e., Unew = J@Uold@J.T. Even if you start with correlated uncertainties, the computation of the Jacobian shouldn't depend on that fact. If the Jacobian is not the issue, how do we obtain the large numbers?
This has to do with how METAS UncLib handles the Jacobian matrix internally.
Take, for example, the SVD of the above matrix A:
If the matrix A is treated as uncorrelated, then we have 16 input quantities and 2*16 + 4 = 36 output quantities (the entries of U and V plus the diagonal of S). So the size of the Jacobian matrix is 36 x 16.
In the case where the matrix A is built from 8 uncorrelated inputs (the elements of a and b), the size of the Jacobian matrix is 36 x 8.
METAS UncLib updates the Jacobian matrix using the chain rule at each computational step.
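A minimal sketch of that bookkeeping (my own finite-difference illustration, not the actual METAS UncLib internals; I use a simple matrix square instead of the SVD to keep the finite differences well conditioned). Both routes below give the same output covariance, but the Jacobian that is carried along has as many columns as there are basic inputs, which is the 16-vs-8 difference discussed above.

import numpy as np

# central finite-difference Jacobian of f at x
def fd_jacobian(f, x, eps=1e-6):
    y0 = f(x)
    J = np.empty((y0.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x); dx[k] = eps
        J[:, k] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

a0 = np.array([1., 2., 3., 4.])
b0 = np.array([5., 6., 7., 8.])
x0 = np.concatenate([a0, b0])                 # 8 basic inputs
U_basic = 0.01 * np.eye(8)                    # their covariance

build_A = lambda x: (np.outer(x[:4], x[4:]) + np.outer(x[4:], x[:4])).ravel()  # 16 outputs
square  = lambda vecA: (vecA.reshape(4, 4) @ vecA.reshape(4, 4)).ravel()       # 16 outputs

# Route 1: treat the 16 entries of A as the inputs of the last operation
J_A   = fd_jacobian(build_A, x0)              # 16 x 8
cov_A = J_A @ U_basic @ J_A.T                 # 16 x 16, correlated, rank <= 8
J_sq  = fd_jacobian(square, build_A(x0))      # 16 x 16
U1    = J_sq @ cov_A @ J_sq.T

# Route 2: carry the Jacobian with respect to the 8 basic inputs (chain rule)
J_chain = J_sq @ J_A                          # 16 x 8
U2      = J_chain @ U_basic @ J_chain.T

print(J_A.shape, J_sq.shape, J_chain.shape)
print(np.allclose(U1, U2))                    # both routes give the same Unew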
Oh yes, you are right. I forgot that the size of J depends on the assumption of which parameters are basic and which are derived. Thanks for the clarification.
Hi there!
I tested the SVD option; however, it seems to give wildly wrong uncertainties in the left singular vectors for low-rank matrices. Below is an example of a 4x4 matrix of rank 2. I tested two cases, first with unique eigenvalues and second with repeated eigenvalues.
The results are below. As you can see, there are huge numbers that don't look realistic. I assume this has to do with repeated eigenvalues, as the problem appears in both cases for the null eigenvalues.