flame / libflame

High-performance object-based library for DLA computations

Segmentation fault - invalid memory reference when running Netlib LAPACK 3.9.0 xlintsts #46

Open akesandgren opened 3 years ago

akesandgren commented 3 years ago

Hi!

I know that this is somewhat on the wrong side of the bleeding edge for libFLAME, but in the process of investigating BLIS+libFLAME as the main BLAS/LAPACK libs for EasyBuild's foss toolchain I got hit by this.

Building libFLAME through EasyBuild using the gobff/2020b toolchain. I.e., GCC is 10.2.0

Building Netlib lapack tmglib (and ref lapack lib) and testing files with: -O0 -frecursive -std=legacy -mieee-fp -fno-trapping-math -fno-math-errno -march=native (so as to not introduce errors from the compiler in the test suite)

Linking with -lrefblas -lflame -lreflapack (since libflame doesn't contain all functions needed from lapack 3.9.0 yet). Running xlintsts I get:

easybuild-kvm [TESTING]$ ./LIN/xlintsts < stest.in
 Tests of the REAL LAPACK routines 
 LAPACK VERSION 3.5.0

 The following parameter values will be used:
    M   :       0     1     2     3     5    10    50
    N   :       0     1     2     3     5    10    50
    NRHS:       1     2    15
    NB  :       1     3     3     3    20
    NX  :       1     0     5     9     1
    RANK:      30    50    90

 Routines pass computational tests if test ratio is less than   30.00

 Relative machine underflow is taken to be    0.117549E-37
 Relative machine overflow  is taken to be    0.340282E+39
 Relative machine precision is taken to be    0.596046E-07

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f0da67ec20f in ???
#1  0x420137 in ???
#2  0x7f0da6ed8a94 in ???
#3  0x7f0da6eb4d57 in ???
#4  0x4b3248 in ???
#5  0x42ac5c in ???
#6  0x422ab2 in ???
#7  0x4267b7 in ???
#8  0x7f0da67cd0b2 in ???
#9  0x40620d in ???
#10  0xffffffffffffffff in ???
Segmentation fault

fgvanzee commented 3 years ago

@akesandgren Thanks for this report, but it's not really clear to me how to go about trying to reproduce. Could you provide some details in that regard?

akesandgren commented 3 years ago

The easy way is:

git clone https://github.com/akesandgren/lapack.git
cd lapack
git checkout v3.9.0-blis-test
cp make.inc.blis-test make.inc
# Change BLASLIB to be the librefblas
make -j blaslib lib
cd TESTING/LIN
make -j single
cd ..
./LIN/xlintsts < stest.in

This has so far failed on Broadwell, SkylakeX, and AMD EPYC.
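
Since every frame in the backtrace above shows up as `???`, a first step after reproducing could be to map the addresses back to function names. The following is only a sketch: `exe_frames` is a hypothetical helper (not part of LAPACK or libflame), and it assumes the test binary is a non-PIE executable, so that short addresses such as 0x420137 belong to `xlintsts` itself. The long 0x7f... addresses belong to shared objects and are skipped here, because resolving them also needs the load address of each library.

```shell
#!/bin/sh
# Hypothetical helper: read a gfortran backtrace on stdin and print only the
# frame addresses that fit in 8 hex digits (assumed to lie in the main
# executable). Longer addresses (0x7f..., 0xffff...) are dropped.
exe_frames() {
    sed -n 's/^#[0-9][0-9]*  *\(0x[0-9a-f]\{1,8\}\) in .*$/\1/p'
}

# Usage sketch (assumes the backtrace was saved to backtrace.txt):
#   exe_frames < backtrace.txt | xargs addr2line -f -e ./LIN/xlintsts
```

`addr2line -f` prints the function name for each address if the binary still has its symbol table. Rebuilding the test programs and libraries with `-g` added to the compile flags, or running the test under gdb, would likely give file and line information as well.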