Reference-LAPACK / lapack

LAPACK development repository

Some tests fail for NVIDIA HPC SDK 20.7, 20.9, 20.11 #430

Open wyphan opened 4 years ago

wyphan commented 4 years ago

Hi,

I just tried compiling reference LAPACK 3.9.0 using the newly released NVIDIA HPC SDK 20.7 on an AMD Zen2 processor (Ryzen 5 3600X). I noticed that some of the tests failed:

            -->   LAPACK TESTING SUMMARY  <--
        Processing LAPACK Testing output found in the TESTING directory
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1107279     285 (0.026%)    0   (0.000%)    
DOUBLE PRECISION    1221707     280 (0.023%)    0   (0.000%)    
COMPLEX             641118      23  (0.004%)    0   (0.000%)    
COMPLEX16           684278      140 (0.020%)    0   (0.000%)    

--> ALL PRECISIONS  3654382     728 (0.020%)    0   (0.000%)    
Testing REAL              Singular-Value-Decomposition-ssvd.out
  SBD drivers:     56 out of  14820 tests failed to pass the threshold
  SBD drivers:     56 out of  14820 tests failed to pass the threshold
  SBD drivers:     56 out of  14820 tests failed to pass the threshold
  SBD drivers:     56 out of  14820 tests failed to pass the threshold
  SBD drivers:     56 out of  14820 tests failed to pass the threshold
 passed: 51300
failing to pass the threshold: 280
Testing REAL              Linear-Equation-routines-stest.out
  SLS drivers:      4 out of 105840 tests failed to pass the threshold
 passed: 299334
failing to pass the threshold: 4
Testing REAL              RFP-linear-equation-routines-stest_rfp.out
   STFSM auxiliary routine:     1 out of  7776 tests failed to pass the threshold
 passed: 5352
failing to pass the threshold: 1
Testing DOUBLE PRECISION Singular-Value-Decomposition-dsvd.out
  DBD drivers:     56 out of  14820 tests failed to pass the threshold
  DBD drivers:     56 out of  14820 tests failed to pass the threshold
  DBD drivers:     56 out of  14820 tests failed to pass the threshold
  DBD drivers:     56 out of  14820 tests failed to pass the threshold
  DBD drivers:     56 out of  14820 tests failed to pass the threshold
 passed: 51300
failing to pass the threshold: 280
Testing COMPLEX           Linear-Equation-routines-ctest.out
  CPB:     11 out of   3458 tests failed to pass the threshold
  CPB drivers:      4 out of   4750 tests failed to pass the threshold
  CLS drivers:      8 out of 105840 tests failed to pass the threshold
 passed: 304541
failing to pass the threshold: 23
Testing COMPLEX16          Singular-Value-Decomposition-zsvd.out
  ZBD drivers:     28 out of  14340 tests failed to pass the threshold
  ZBD drivers:     28 out of  14340 tests failed to pass the threshold
  ZBD drivers:     28 out of  14340 tests failed to pass the threshold
  ZBD drivers:     28 out of  14340 tests failed to pass the threshold
  ZBD drivers:     28 out of  14340 tests failed to pass the threshold
 passed: 20425
failing to pass the threshold: 140

Attached is the full testing log: testing_results.txt

Edit: added processor name

wyphan commented 4 years ago

Also, here is the make.inc that I used to compile. I roughly followed the steps listed on this page to build a shared library version of LAPACK by modifying Makefile and SRC/Makefile, but I think these modifications should be unrelated to the test failures.

martin-frbg commented 4 years ago

If nvfortran is in any way related to recent flang, you could check whether adding -Kieee to FFLAGS helps. (With the AMD AOCC flavor of flang, I found it necessary to add -fno-unroll-loops, so this could be another option to try in order to narrow it down.)
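A minimal sketch of how that flag could be added without editing make.inc by hand. The helper name `add_kieee` is hypothetical, and the sketch assumes that make.inc defines FFLAGS on a single line of the form `FFLAGS = ...`; it is not taken from the make.inc attached to this issue.

```shell
# Hypothetical helper: append -Kieee to the FFLAGS line of a make.inc-style file.
# Assumption: FFLAGS is defined on one line starting with "FFLAGS" followed by "=".
# Related variables such as FFLAGS_NOOPT are deliberately left untouched.
add_kieee() {
    inc_file="$1"

    # Do nothing if the flag is already present on the FFLAGS line.
    if grep -Eq '^FFLAGS[[:space:]]*=.*-Kieee' "$inc_file"; then
        return 0
    fi

    # Write to a temporary file first, then copy back so the
    # original file keeps its permissions.
    tmp_file="$(mktemp)" || return 1
    sed -E 's/^(FFLAGS[[:space:]]*=.*)$/\1 -Kieee/' "$inc_file" > "$tmp_file" &&
        cat "$tmp_file" > "$inc_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Calling `add_kieee make.inc` once before the build should be enough; a second call leaves the file unchanged.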

wyphan commented 4 years ago

@martin-frbg I think it is more related to the PGI compiler than AOCC flang (Actually, the pgfortran alias is still there and now points to nvfortran), but I'll give it a try once I get back to my Zen2 workstation.

Edit: the -Kieee flag does the job! Now it's down to only 5 numerical errors:

            -->   LAPACK TESTING SUMMARY  <--
        Processing LAPACK Testing output found in the TESTING directory
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1300419     1   (0.000%)    0   (0.000%)    
DOUBLE PRECISION    1302223     4   (0.000%)    4   (0.000%)    
COMPLEX             768366      0   (0.000%)    0   (0.000%)    
COMPLEX16           769178      0   (0.000%)    0   (0.000%)    

--> ALL PRECISIONS  4140186     5   (0.000%)    4   (0.000%)
Testing REAL              RFP-linear-equation-routines-stest_rfp.out
   STFSM auxiliary routine:     1 out of  7776 tests failed to pass the threshold
 passed: 5352
failing to pass the threshold: 1
Testing DOUBLE PRECISION Nonsymmetric-Eigenvalue-ded.out
  DDRVES: DGEES1 returned INFO=     6.
  DDRVES: DGEES1 returned INFO=     6.
  DES:    2 out of  3264 tests failed to pass the threshold
  DGET24: DGEESX1 returned INFO=     6.
  DGET24: DGEESX1 returned INFO=     6.
  DSX:    2 out of  3494 tests failed to pass the threshold
 passed: 6198
failing to pass the threshold: 4
Info Error: 4

wyphan commented 3 years ago

Update: Building with NVIDIA HPC SDK versions 20.9 and 20.11 also results in some errors. As suggested by @martin-frbg (at least for building OpenBLAS with PGI compilers / NVIDIA HPC SDK), building reference LAPACK also requires the -Kieee compiler flag. Attached are the make.inc file (renamed to make.inc.nv.txt) that I use for reference LAPACK and the three full build logs (compressed as gzip files), one each for NVIDIA HPC SDK 20.7, 20.9, and 20.11.

The command that I use to build is

$ make clean
$ make -j 12 blas_testing lapack_testing > build-nv20.11.log 2>&1

make.inc.nv.txt

build-nv20.7.log.gz build-nv20.9.log.gz build-nv20.11.log.gz
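For comparing logs across SDK versions, the per-section counts can be tallied with a short shell function. This is a sketch only: the function name `count_threshold_failures` is hypothetical, and it assumes the saved log contains `failing to pass the threshold: N` lines in the same form as the summaries quoted above.

```shell
# Hypothetical helper: sum every "failing to pass the threshold: N" line in a log.
# Assumption: the log uses the same line format as the testing summary shown
# in this thread; other lines (including "... tests failed to pass the
# threshold") are ignored.
count_threshold_failures() {
    awk -F': *' '
        /failing to pass the threshold:/ { total += $2 }
        END { print total + 0 }
    ' "$1"
}
```

For example, `count_threshold_failures build-nv20.11.log` prints a single number, which can be compared with the "numerical error" column of the ALL PRECISIONS row.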