Reference-LAPACK / lapack

LAPACK development repository

One failure in the tests for the generalized non-symmetric eigenvalue problem ZGGEV #744

Open dbielich opened 1 year ago

dbielich commented 1 year ago

Output of failure within zgd.out:

    Testing COMPLEX16 Nonsymmetric-Generalized-Eigenvalue-Problem-driver-zgd.out
    ZGV drivers: 1 out of 1092 tests failed to pass the threshold
    passed: 9390
    failing to pass the threshold: 1

    Matrix order= 20, type=19, seed=1284,3436,2461,1449, result 5 is 4.504D+15

I am using gcc/10.2.0 and built the library with the example make.inc (make.inc.gfortran in the INSTALL directory).
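One quick sanity check on the reported value: 4.504D+15 agrees, to the printed precision, with 2^52, the reciprocal of the double precision spacing 2^-52. The sketch below only performs that comparison. The helper name parse_fortran_double is hypothetical, and reading the match as a sentinel "1/ulp" value set by the tester (rather than a computed ratio slightly above the threshold) is an assumption that would need to be confirmed against the test driver source.

```python
import re

def parse_fortran_double(text):
    """Convert Fortran D-exponent notation (e.g. '4.504D+15') to a Python float.

    Hypothetical helper, not part of LAPACK or its test scripts.
    """
    return float(text.strip().replace("D", "E").replace("d", "e"))

# Failure line copied from zgd.out as reported in this issue.
line = ("Matrix order= 20, type=19, seed=1284,3436,2461,1449, "
        "result 5 is 4.504D+15")

match = re.search(r"result\s+(\d+)\s+is\s+([0-9.+\-DdEe]+)", line)
test_no = int(match.group(1))
value = parse_fortran_double(match.group(2))

# 2**52 is the reciprocal of 2**-52, the spacing of doubles just above 1.0.
# Treating it as the tester's "1/ulp" sentinel is an assumption.
ulp_inv = 2.0 ** 52
rel_diff = abs(value - ulp_inv) / ulp_inv

print(f"test {test_no}: reported {value:.4e}, 2**52 = {ulp_inv:.4e}, "
      f"relative difference {rel_diff:.1e}")
```

If the relative difference is at the level of the four printed digits, the value is unlikely to be a residual that merely drifted past the threshold, which bears on whether this counts as a "minor" failure.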

weslleyspereira commented 1 year ago

Hi! Do you still have this problem?

dbielich commented 1 year ago

I have just cloned the current master and compiled it, and yes, this one error still occurs consistently.

I've just compiled using gcc/11.3.0 instead of gcc/10.2.0 as noted earlier.

I will say, this does not impact my current work. I only noticed the error and thought I should raise it. Are you unable to see or reproduce it when you build the library and the tester runs?

weslleyspereira commented 1 year ago

Thanks!

I couldn't reproduce the problem using GCC 11.1.0 on Ubuntu 20.04. I am installing version 10.2.0 to test.

Are there any specific compilation flags? Did you use the Makefile or CMake? Debug mode?

dbielich commented 1 year ago

I am using the Makefile.

The following will reproduce the issue for me:

    git clone https://github.com/Reference-LAPACK/lapack.git
    cd lapack
    cp INSTALL/make.inc.gfortran make.inc
    make -j10

Once finished building and testing, the same single error for ZGV pops up:

    Testing COMPLEX16 Nonsymmetric-Generalized-Eigenvalue-Problem-driver-zgd.out
    ZGV drivers: 1 out of 1092 tests failed to pass the threshold
    passed: 9390
    failing to pass the threshold: 1

Maybe it is hardware specific? I am using Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz CPUs.

dbielich commented 1 year ago

So ... maybe I shouldn't have done this, but it all seems eigenvalue related.

I decided to follow the commands provided above on my personal Mac instead of the main machine I use.

I do not get the COMPLEX16 error, but instead in DOUBLE there are 5 errors in total.

    Testing DOUBLE PRECISION Nonsymmetric-Eigenvalue-ded.out
    DDRVES: DGEES1 returned INFO= 6.
    DDRVES: DGEES1 returned INFO= 6.
    DES: 2 out of 3810 tests failed to pass the threshold
    DGET24: DGEESX1 returned INFO= 6.
    DGET24: DGEESX1 returned INFO= 6.
    DSX: 2 out of 3494 tests failed to pass the threshold
    passed: 7374
    failing to pass the threshold: 4
    Info Error: 4

    Testing DOUBLE PRECISION Nonsymmetric-Generalized-Eigenvalue-Problem-driver-dgd.out
    DDRGES3: DGGES3 returned INFO= 9.
    DGS drivers: 1 out of 1555 tests failed to pass the threshold
    passed: 8922
    failing to pass the threshold: 1
    Info Error: 1

I don't know if this is worth mentioning. Maybe what I do is not correct.

martin-frbg commented 1 year ago

I believe this is essentially the same as in #732 (and issues referenced therein), though one would need to look at the detailed report in testing_results.txt to see if these are indeed "minor" failures as described in the FAQ. With increasingly aggressive code optimization by compilers, and small differences in hardware implementations, register counts, etc., the test suite appears to have become increasingly fragile.
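To make that kind of triage less manual, the summary lines quoted in this thread can be collected with a short script. This is a minimal sketch, assuming the output keeps the line shapes seen above ("... N out of M tests failed to pass the threshold" and "... returned INFO= k."); the function name summarize_failures and both regular expressions are hypothetical and are not part of the LAPACK test scripts.

```python
import re

# Patterns inferred from the output pasted in this issue (an assumption).
THRESHOLD_RE = re.compile(
    r"(\w+(?: drivers)?):\s+(\d+) out of\s+(\d+) tests failed to pass the threshold"
)
INFO_RE = re.compile(r"(\w+):\s+(\w+) returned INFO=\s*(-?\d+)")

def summarize_failures(text):
    """Return (threshold_failures, info_errors) found in tester output.

    threshold_failures: list of (test path name, failed count, total count)
    info_errors: list of (test driver, routine, INFO value)
    """
    thresholds = [(name, int(failed), int(total))
                  for name, failed, total in THRESHOLD_RE.findall(text)]
    infos = [(driver, routine, int(info))
             for driver, routine, info in INFO_RE.findall(text)]
    return thresholds, infos

# Sample taken from the ded.out summary reported above.
sample = """\
DDRVES: DGEES1 returned INFO= 6.
DDRVES: DGEES1 returned INFO= 6.
DES: 2 out of 3810 tests failed to pass the threshold
DGET24: DGEESX1 returned INFO= 6.
DGET24: DGEESX1 returned INFO= 6.
DSX: 2 out of 3494 tests failed to pass the threshold
"""

thresholds, infos = summarize_failures(sample)
for name, failed, total in thresholds:
    print(f"{name}: {failed}/{total} above threshold")
for driver, routine, info in infos:
    print(f"{driver}: {routine} returned INFO={info}")
```

The same function could be applied to the contents of testing_results.txt. Separating threshold failures from nonzero INFO returns matters here because the two Mac failures are INFO errors, while the original ZGV report is a threshold failure.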

weslleyspereira commented 1 year ago

I couldn't reproduce it on my machine (Ubuntu 20.04) with GCC 10.3 or 11.1. I tried:

I have the same feeling as @martin-frbg. @dbielich, you tested on a Mac. Which version? What was the OS on the first machine you used?