OrderN / CONQUEST-release

Full public release of large scale and linear scaling DFT code CONQUEST
http://www.order-n.org/
MIT License

Segmentation fault - invalid memory reference at pzhegvx call #67

Closed lionelalexandre closed 3 years ago

lionelalexandre commented 3 years ago

I get a segmentation fault when running Conquest (develop branch) on silicon and other systems. It seems to originate from the ScaLAPACK routine pzhegvx. I checked the arrays and dimensions passed to pzhegvx and they appear to be correct. The code was compiled with gfortran.

davidbowler commented 3 years ago

Can you supply input files please?

lionelalexandre commented 3 years ago

The input files are attached: Si_test.zip

davidbowler commented 3 years ago

OK - I have no problems running this with the version:

Git Branch: develop; tag, hash: v1.0.2-pre-30-g9f3c7e79

(which is the latest version). Can you please set:

IO.Iprint 3
IO.WriteOutToFile F

and run the code, redirecting output to a file (e.g. using mpirun -np 1 Conquest | tee output_crash) and send me the output? Please also tell me the version of Conquest, the compiler you're using, and which ScaLAPACK version.
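A gfortran crash trace is typically written to standard error, which a plain pipe into tee does not capture. The following is a minimal sketch of one way to save both streams, assuming a POSIX shell. The helper name `run_and_log` is hypothetical and not part of Conquest.

```shell
# Hypothetical helper (not part of Conquest): run a command, show its
# output on the terminal, and save both stdout and stderr to a log file.
run_and_log() {
    log_file="$1"
    shift
    # 2>&1 merges stderr into stdout before the pipe, so tee sees both.
    "$@" 2>&1 | tee "$log_file"
}

# Example, using the same command as suggested above (uncomment to run):
# run_and_log output_crash mpirun -np 1 Conquest
```

With this, any segmentation fault message or backtrace should end up in output_crash alongside the normal program output.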

lionelalexandre commented 3 years ago

I also suspect that the problem is related to ScaLAPACK (v2.1.0), although I have recompiled the library with the same version of gfortran. I get the same issue with an older version of Conquest. Attached are two output files, one from the current develop version and one from the older version. I'll continue to dig in... Si_test.zip

lionelalexandre commented 3 years ago

Just to let you know that I've recompiled ScaLAPACK and Conquest using gfortran/mpich (gcc7) and I get the same error.

lionelalexandre commented 3 years ago

The problem is solved by using the internal BLAS library of LAPACK, compiled under its default name "refblas". Note that every attempt to run Conquest compiled with the netlib BLAS libraries failed with the "Segmentation fault" when pzhegvx is called.
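For anyone hitting the same crash, it may help to confirm which BLAS and LAPACK libraries the executable actually picks up. The sketch below assumes a Linux system with `ldd` and a dynamically linked binary; the helper name `filter_linalg_libs` and the path `./Conquest` are illustrative assumptions.

```shell
# Hypothetical helper: keep only the lines of a library listing that name
# BLAS or LAPACK libraries (the pattern "lapack" also matches scalapack).
filter_linalg_libs() {
    grep -iE 'blas|lapack'
}

# Example (uncomment to run; assumes the executable is ./Conquest):
# ldd ./Conquest | filter_linalg_libs
```

If the libraries are linked statically, `ldd` will not list them, and the link line of the build configuration has to be checked instead.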