OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

Regression with current pre-0.2.9 git and the elk code #329

Closed martin-frbg closed 10 years ago

martin-frbg commented 10 years ago

Just a quick heads up - I will try to pinpoint the problem if possible later: Using the ELK "computational chemistry" code from elk.sourceforge.net I see lots of failures in the test problems distributed with the code when I build it against the current git version of openBLAS optimized for either Haswell or Sandybridge. Using 0.2.8 built for Sandybridge, all is well on this i7-4770 (though openblas gives no measurable speedup on these small problems). To reproduce:

  1. get elk-2.2.10.tgz from sourceforge
  2. unpack it and run the included "setup" script to create make.inc
  3. in make.inc, specify the openblas library instead of the included lapack&blas
  4. run make, and then "make test"
  5. With 0.2.8, all 17 tests pass within a total runtime of around 155 seconds. With 0.2.9, runtime is more than doubled (due to the test cases failing to converge) and most tests report failure. One hint: in test-002, valgrind emits lots of "use of uninitialized value" warnings in calls to dgemv_t, dgemm_kernel, zscal_k etc.
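
The reproduction steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name `use_openblas` is hypothetical, `LIB_LPK` is assumed to be the make.inc variable that names the bundled lapack.a and blas.a, and the unpack directory and the OpenBLAS library path are placeholders to adjust.

```shell
#!/bin/sh
# Hypothetical helper: point an Elk make.inc at an external OpenBLAS build.
# Assumption: make.inc contains a line starting with "LIB_LPK" that lists
# the bundled LAPACK/BLAS archives; check this in your own make.inc.
use_openblas() {
    makeinc=$1      # path to the make.inc created by Elk's "setup" script
    libpath=$2      # path to libopenblas.a (placeholder)
    tmp="$makeinc.tmp.$$"
    # Rewrite only the LIB_LPK line; every other setting is left untouched.
    sed "s|^LIB_LPK[[:space:]]*=.*|LIB_LPK = $libpath|" "$makeinc" > "$tmp" &&
        mv "$tmp" "$makeinc"
}

# Usage for steps 1-4, run by hand once the tarball has been downloaded
# (directory name and library path are assumptions):
#   tar xzf elk-2.2.10.tgz && cd elk-2.2.10
#   ./setup
#   use_openblas make.inc /path/to/libopenblas.a
#   make && make test
```

Editing make.inc through a function keeps the change repeatable when switching between an 0.2.8 and a git build of the library.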

xianyi commented 10 years ago

@wernsaar , could you look at this issue? Thank you

wernsaar commented 10 years ago

On 15.12.2013 23:36, martin-frbg wrote:

> Just a quick heads up - I will try to pinpoint the problem if possible later: [...]

Hi,

If you want to build for Haswell, Piledriver or Bulldozer, you need recent versions of gcc and binutils, and you need valgrind-3.9.0 or newer.

Werner

martin-frbg commented 10 years ago

Build system in question is openSUSE 12.3, so fairly recent (binutils 2.23, gcc 4.7.2; using gcc 4.8.2 does not solve the problem). Will do further analysis with valgrind 3.9 instead of the 3.8.1 used for the above. Please note that a Sandybridge build of 0.2.8 works without errors, while 0.2.9 Sandybridge is unusable on the same system.

martin-frbg commented 10 years ago

Just for the record, updating binutils to 2.24 did not change anything. Updating valgrind did not change anything fundamental about the slew of warnings either, though I do have to concede that it generates a similarly high number of complaints for 0.2.8, even though that version manages to yield the correct results.

martin-frbg commented 10 years ago

Finally got around to taking another look: it turned out the problem with 0.2.9-rc1 is specific to OpenMP. When Elk is compiled without the "-fopenmp" from its default make.inc settings, all its tests pass on Haswell with 0.2.9-rc1. Conversely, a -fopenmp build linked against 0.2.9-rc1 fails even on the Nehalem architecture, where 0.2.8 works well (provided that it was built with USE_THREAD=0, USE_OPENMP=1).
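
The comparison described above (Elk with and without "-fopenmp" against the same library) might be scripted roughly as follows. The helper name `strip_openmp` is hypothetical, and the TARGET value in the comment is only an example; the USE_THREAD and USE_OPENMP settings are the ones quoted in the comment.

```shell
#!/bin/sh
# Hypothetical helper: drop "-fopenmp" from every line of an Elk make.inc
# so that Elk itself is built without OpenMP.
strip_openmp() {
    makeinc=$1
    tmp="$makeinc.tmp.$$"
    # Remove the flag together with any whitespace directly in front of it.
    sed 's/[[:space:]]*-fopenmp//g' "$makeinc" > "$tmp" &&
        mv "$tmp" "$makeinc"
}

# OpenBLAS side, with the settings quoted above (TARGET is an example):
#   make TARGET=NEHALEM USE_THREAD=0 USE_OPENMP=1
# Elk side, non-OpenMP variant (rebuild Elk from scratch afterwards):
#   strip_openmp make.inc
#   make && make test
```

Keeping a copy of the unmodified make.inc makes it easy to switch back to the OpenMP build for the failing case.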

martin-frbg commented 10 years ago

The problem apparently was introduced well before the Haswell branch was merged. Bisecting now.
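
A bisection like the one mentioned above can be automated with `git bisect run`. The sketch below is not the procedure actually used in this thread; the wrapper name `bisect_regression`, the script name `check_elk.sh` and the `v0.2.8` tag in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around "git bisect run".
# $1 = known-good revision, $2 = known-bad revision, rest = test command.
# The test command must exit 0 on a good revision and 1..124 on a bad one.
bisect_regression() {
    good=$1
    bad=$2
    shift 2
    # "git bisect start" takes the bad revision first, then the good one.
    git bisect start "$bad" "$good" || return 1
    git bisect run "$@"
    status=$?
    # Print the first bad commit as the last line, then leave bisect mode.
    git rev-parse refs/bisect/bad
    git bisect reset >/dev/null 2>&1
    return $status
}

# Usage inside an OpenBLAS checkout, where check_elk.sh is a hypothetical
# script that rebuilds OpenBLAS and Elk and then runs Elk's "make test":
#   bisect_regression v0.2.8 HEAD ./check_elk.sh
```

Because each bisection step needs a full rebuild of both packages, the test script should fail fast on the first failing Elk test.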

martin-frbg commented 10 years ago

dfd1064d7be6b5c43759c38cab79f094a453e906 is the first bad commit
commit dfd1064d7be6b5c43759c38cab79f094a453e906
Author: Zhang Xianyi <traits.zhang@gmail.com>
Date:   Sat Nov 2 15:09:33 2013 +0800

    refs #287. Don't enable OpenMP for netlib LAPACK sequential Fortran codes.

martin-frbg commented 10 years ago

Have confirmed now that removing the distinction between F(P)FLAGS and LAPACK_F(P)FLAGS introduced by the above change to Makefile.system fixes my problem also in current git head.

xianyi commented 10 years ago

@martin-frbg , Thank you for the investigation.

I added dfd1064 to fix the SEGFAULT with OpenMP on Windows.

martin-frbg commented 10 years ago

Yes, I saw that, but it was not clear to me whether that was a real fix and not just papering over a different problem. If "SEGFAULT on Windows" trumps "wrong result on (at least) Linux", can the change be made "#ifdef WINDOWS", please?

xianyi commented 10 years ago

Please try develop branch. Thank you again.

martin-frbg commented 10 years ago

Thank you. (Might it make sense to revisit #287 now that 0.2.9 contains a newer LAPACK?)

martin-frbg commented 10 years ago

Will see if I can get openblas&elk built on a windows/mingw system in the near future for additional insight.