Open bartoldeman opened 2 years ago
Aggressive compiler optimization and/or the usage of a highly optimized BLAS are known sources of failures in the stringent LAPACK tests. See for example the LAPACK test summary shown in https://github.com/xianyi/OpenBLAS/pull/3609. The failures do not necessarily mean that something is fundamentally wrong.
How do I interpret LAPACK testing failures? [...] Testing failures can be divided into two categories. Minor testing failures, and major testing failures. A minor testing failure is one in which the test ratio reported in the LAPACK/TESTING/.out file slightly exceeds the threshold (specified in the associated LAPACK/TESTING/.in file). The cause of such failures can mainly be attributed to differences in the implementation of math libraries (square root, absolute value, complex division, complex absolute value, etc). These failures are negligible, and do not affect the proper functioning of the library.
The failures related to `laqr` fall into the category of minor failures. The flag `-march=haswell` activates vectorization and heavy FMA usage. With aggressive compiler optimization, many tests end up slightly above the threshold. When you apply the `laqr` patch, one instruction is saved per update. That means one instruction less to accumulate errors, which is just enough to fall below the tight error threshold. This was actually one of the intentions behind this patch.
Description
Tested on `master`, compiling LAPACK with [...] brings about many test failures:
There are three categories of failures here:
1) A bug in GCC, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107254. Fortunately a fix is provided there and it can be trivially backported.
2) Two minor test failures in REAL and COMPLEX, see https://github.com/Reference-LAPACK/lapack/issues/679.
3) The tests insisting that when e.g. `CGEEV` is called to provide eigenvectors, it will give the exact same eigenvalues (no numerical tolerance) as when it does not provide eigenvectors.

Applying the patch to GCC, tests go down to:
Increasing the tolerance for 2) using
we get
The third one is more complex to analyze. The way I understand it is that gfortran has some freedom in how to apply FMAs to expressions such as `a*b+c*d` (i.e. `fma(a,b,c*d)` or `fma(c,d,a*b)`), and these give slightly different results. The loop bounds differ between the code paths that compute eigenvectors and those that do not, so it is often the case that a loop is vectorized in one path and not in the other, or is vectorized for different loop iterations, and the vectorized use of FMA isn't identical to the unvectorized use of FMA. And with complex numbers of course there are even more permutations possible. Adding parentheses forces the compiler to use one variety.

Applying this patch:
makes the test failures go to 0.
But it looks quite fragile, in that it takes trial and error to see which expressions need parentheses.
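To make the FMA ambiguity concrete, here is a small sketch in Python rather than Fortran. The `fma` helper is my own stand-in, not a library call: it emulates a fused multiply-add by doing the arithmetic exactly with `fractions.Fraction` and rounding once at the end. The input values are chosen by hand so that the two possible contractions of `a*b + c*d` disagree.

```python
from fractions import Fraction

def fma(x, y, z):
    """Hypothetical helper: emulate x*y + z with a single final rounding.

    Fraction arithmetic on floats is exact, and float(Fraction) rounds
    correctly, so this mimics a hardware FMA for finite inputs.
    """
    return float(Fraction(x) * Fraction(y) + Fraction(z))

# Hand-picked inputs: a*b = 1 + 2**-26 + 2**-54 exactly, which rounds
# to 1 + 2**-26 in double precision, while c*d = -(1 + 2**-26) is exact.
a = b = 1.0 + 2.0**-27
c = -1.0
d = 1.0 + 2.0**-26

unfused = a * b + c * d        # both products rounded, then added
fused_ab = fma(a, b, c * d)    # a*b kept exact inside the FMA
fused_cd = fma(c, d, a * b)    # c*d kept exact inside the FMA

print("unfused :", unfused)
print("fma(a,b,c*d):", fused_ab)
print("fma(c,d,a*b):", fused_cd)
```

With these inputs the first contraction keeps the low-order bit of `a*b` and returns `2**-54`, while the second one (and the unfused form) returns exactly zero. An exact-equality test therefore passes or fails depending on which form the compiler happens to pick for a given loop.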
Would it be OK instead to replace test 5 (e.g. https://github.com/Reference-LAPACK/lapack/blob/28f7e8309608b92aaec2e2556d4b25d758ccada9/TESTING/EIG/cdrvev.f#L799-L805) with a test that uses tolerances, like the one linked below, or would that go against some specification?
https://github.com/Reference-LAPACK/lapack/blob/28f7e8309608b92aaec2e2556d4b25d758ccada9/TESTING/EIG/cchkhs.f#L87-L87 https://github.com/Reference-LAPACK/lapack/blob/28f7e8309608b92aaec2e2556d4b25d758ccada9/TESTING/EIG/cchkhs.f#L825-L835
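For illustration, the difference between the two kinds of test can be sketched as follows. This is Python, not a transcription of the Fortran in `cdrvev.f` or `cchkhs.f`; the function names, the sample eigenvalues, and the `THRESH` value are all made up for the example, and `sys.float_info.epsilon` / `sys.float_info.min` are used as stand-ins for the ulp and underflow constants the test code would use.

```python
import sys

ULP = sys.float_info.epsilon   # stand-in for the unit roundoff constant
UNFL = sys.float_info.min      # stand-in for the underflow threshold
THRESH = 20.0                  # hypothetical threshold, as set in an .in file

def exact_match(w1, w2):
    """Exact comparison: any difference at all counts as a failure."""
    return all(x == y for x, y in zip(w1, w2))

def scaled_ratio(w1, w2):
    """Tolerance-based comparison: max |w1 - w2| relative to ulp * max |w|."""
    temp1 = max(max(abs(x), abs(y)) for x, y in zip(w1, w2))
    temp2 = max(abs(x - y) for x, y in zip(w1, w2))
    return temp2 / max(UNFL, ULP * max(temp1, temp2))

# Made-up eigenvalues: the two runs differ by a few ulps in one entry.
w_with_vectors = [complex(1.0, 2.0), complex(-3.0, 0.5)]
w_without_vectors = [complex(1.0, 2.0 + 4 * ULP), complex(-3.0, 0.5)]

ratio = scaled_ratio(w_with_vectors, w_without_vectors)
print("exact match:", exact_match(w_with_vectors, w_without_vectors))
print("scaled ratio:", ratio, "passes:", ratio < THRESH)
```

Here the exact comparison reports a mismatch, while the scaled ratio is of order one and stays well below the threshold, which is the behaviour I would expect from a tolerance-based replacement for test 5.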
Checklist