DGGEV routine returns wrong eigenvalues

Iasonaspg commented 1 year ago

Description

Dear all,

We have come across a possible bug in the dggev() routine. We have two $26 \times 26$ matrices A and B for which we want to compute the generalized eigenvalues and the right eigenvectors.

Our matrices in general are not symmetric but in this case they are. So we can validate the results with the dsygvd() routine.

To our surprise, the general method returned wrong values while the QZ algorithm is known to be more stable. We have also validated the correct values of the symmetric routine using the Lanczos algorithm.

Is this a possible bug in the implementation of dggev?

 Library: OpenBLAS 0.3.18
 OS: Ubuntu  20.04.1 with kernel 5.15.0-58-generic.

In the following table, we attach the computed eigenvalues of the two methods.

dsygvd	dggev
0.0000000199	-0.0000003125
0.0000000319	0.0000000199
0.0000000454	0.0000009871
0.0000022240	0.0000036185
0.0000036307	0.0000039723
0.0000045522	0.0000040793
0.0000085986	0.0000073732
0.0000086645	0.0000086645
0.0000087255	0.0000088557
0.0000156566	0.0000155257
0.0000157513	0.0000158595
0.0000390328	0.0000391311
0.0000391115	0.0000396816
0.0001017476	0.0000970233
0.0001018231	0.0001014233
0.0001074471	0.0001074471
0.0004344327	0.0004081962
0.0004344969	0.0004229847
0.0061985159	0.0061712380
0.0061985912	0.0061940932
0.8103727235	0.8103727235
0.8103727388	0.8103727478
0.8103727431	0.8103727687
0.8103727478	0.8103727691
0.8103727702	0.8103727722
0.8103727707	0.8103727730

Please note that while the above values seem very small, in reality we are interested for the inverted values. However, we compute them with this way because we are sure only for the positive definiteness of matrix A and not B. So we solve the system $$Bx = \mu A x$$ and compute the inverted values $\lambda = \frac{1}{\mu}$

We attach a minimal example to reproduce the issue. You can modify the conditional to use either the dggev() or dsygvd().

The code is written in C++ and can be compiled with

g++ eig.cpp -o eig -lopenblas # or any other BLAS/LAPACK variant

The order of the two matrices and their values are stored in column major in the binary file named matrices.bin.

dggev_report.zip

Checklist

[x] I've included a minimal example to reproduce the issue
[ ] I'd be willing to make a PR to solve this issue

martin-frbg commented 1 year ago

Much as I hate creating work for myself : you are obviously using LAPACK through (an outdated version of) OpenBLAS, and DGGEV calls a number of BLAS functions that possibly were/are affected by some defect in a cpu-specific optimized implementation of such BLAS function. Can you please check if this is reproducible with either (a) the unoptimized reference implemetation where you opened this issue (b) the current version (0.3.21) of OpenBLAS ?

Iasonaspg commented 1 year ago

Thank you for your reply. I will follow these 2 steps and inform you.

martin-frbg commented 1 year ago

So I've taken a quick look at this now and get essentially the same discrepancies both with latest OpenBLAS and with pure Reference-LAPACK (at least) back to ~~3.7.1~~ 3.3.1 (Note I have not looked into whether the testcase is valid - the change of sign on the first element of the DGGEV result vector and the exact match between the second one and the first for DSYGVD look suspicious if probably coincidental)

Iasonaspg commented 1 year ago

Thank you for comparing the results with the reference LAPACK. Regarding the testcase, the matrices are symmetric and the "right hand side" matrix is positive definite. So I suppose those matrices are valid for usage in DSYGVD.

I would like to help as much as I can. Please feel free to ask me anything.

langou commented 1 year ago

Hi folks, the one thing that comes to mind, beside a bug, is that the eigenvalues of this problem are ill-conditioned and so two backward stable solvers may return widely different answers. The one thing that worries me is: "We have also validated the correct values of the symmetric routine using the Lanczos algorithm." It bothers me that two eigenvalue solvers (Lanczos and dsygvd) find similar answers and a third one (dggev) does not. (Good idea to use dsygvd when possible. Strongly encouraged!) Well, I am not sure I had much of a contribution on that one. We'll try to look some more into this. Thanks for the bug report. Cheers, Julien.

thijssteel commented 1 year ago

while the QZ algorithm is known to be more stable

This is not necessarily true. Yes, the QZ algorithm will generally avoid issues when the matrix B in the pencil (A,B) is ill-conditioned or singular. However, it does not preserve symmetry. The symmetric perturbations of dsygvd often result in better overall results.

the one thing that comes to mind, beside a bug, is that the eigenvalues of this problem are ill-conditioned and so two backward stable solvers may return widely different answers

I have confirmed that the backward errors of the Schur decompositions are on the order of the machine precision ( 4.49e-16). If the Schur decomposition is correct and the eigenvalues are wrong, i think it can only be the condition.

This makes me think, I see questions like this from time to time. Maybe we need to work on some ways to automatically detect ill-conditioned eigenvalues in generalized eigenvalue problems.

langou commented 1 year ago

This is not necessarily true. Yes, the QZ algorithm will generally avoid issues when the matrix B in the pencil (A,B) is ill-conditioned or singular. However, it does not preserve symmetry. The symmetric perturbations of dsygvd often result in better overall results.

Thanks. You are correct! I agree.

I have confirmed that the backward errors of the Schur decompositions are on the order of the machine precision ( 4.49e-16). If the Schur decomposition is correct and the eigenvalues are wrong, i think it can only be the condition.

Thanks. I agree. I think that @thijssteel's experiment on backward errors closes the discussion.

The generalized eigenvalues of this problem are ill-conditioned. We run two generalized eigensolvers. Both are stable. But because of ill-conditioning, computing generalized eigenvalues are far apart.

This makes me think, I see questions like this from time to time. Maybe we need to work on some ways to automatically detect ill-conditioned eigenvalues in generalized eigenvalue problems.

Indeed, but I would say that this is why we have https://github.com/Reference-LAPACK/lapack/blob/b1e25a3b6700e911a6c175aee28ae6e1d57c24db/SRC/dggevx.f and that this is enough.

Reference-LAPACK / lapack

DGGEV routine returns wrong eigenvalues #789