PCMSolver / pcmsolver

An API for the Polarizable Continuum Model
http://pcmsolver.readthedocs.io/
GNU Lesser General Public License v3.0

Green spherical diffuse test broken with Intel compilers #159

Open robertodr opened 6 years ago

robertodr commented 6 years ago

The green-spherical-diffuse test is broken when building with the Intel compilers.

Current Behavior

I compiled on Stallo for these tests using the following module/environment setups. For Intel 2017:

module load CMake/3.9.1
module load Python/2.7.14-intel-2017b
export BOOST_INCLUDEDIR=/home/roberto/Software/boost/include
export BOOST_LIBRARYDIR=/home/roberto/Software/boost/lib

For Intel 2018:

module load CMake/3.9.1
module load Python/3.6.4-intel-2018a
export BOOST_INCLUDEDIR=/home/roberto/Software/boost/include
export BOOST_LIBRARYDIR=/home/roberto/Software/boost/lib

I have tried Debug and Release modes, to no avail. In Release mode, but only for Intel 2018, I have lowered the optimization level from -O3 all the way down to -O1, again to no avail. I suspected an unsafe floating-point optimization, but passing -fp-model precise did not help.
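For readers unfamiliar with what an "unsafe floating-point optimization" can do, here is a minimal sketch of the underlying effect. It is not taken from PCMSolver and is not a diagnosis of this failure. The function names `sum_left` and `sum_right` are hypothetical. It only shows that floating-point addition is not associative, so a compiler that is allowed to reorder a sum can change the result.

```cpp
// Floating-point addition is not associative: the grouping of a sum matters.
// Both functions are illustrative helpers, not PCMSolver code.

// Adds left to right: (a + b) + c.
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

// Adds right to left: a + (b + c).
double sum_right(double a, double b, double c) {
    return a + (b + c);
}
```

With IEEE 754 doubles and no value-changing optimizations, `sum_left(1e16, -1e16, 1.0)` yields 1.0 while `sum_right(1e16, -1e16, 1.0)` yields 0.0, because 1.0 is lost when added to -1e16 first. A numerical test with a tight tolerance can be sensitive to this kind of reordering.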

robertodr commented 6 years ago

This could be related to the ODE integrator. See also #138

hajgato commented 5 years ago

Actually, green-spherical-diffuse also breaks with GCC 7.3.0 when using the AVX or a newer instruction set (tested on a Haswell CPU with the -march=native and -march=native -mno-avx flags).
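One way an instruction-set flag can change numerical results is fused multiply-add (FMA) contraction. Haswell provides FMA instructions, and GCC outside strict ISO mode defaults to -ffp-contract=fast, so an expression such as `a * b + c` may be rounded once instead of twice. Whether this is what breaks the test is an assumption, not something established in this thread. The sketch below uses a hypothetical helper, `product_rounding_error`, to show that the fused and unfused results really differ.

```cpp
#include <cmath>

// Returns the rounding error of the product a * b.
// std::fma rounds only once, so fma(a, b, -p) is the exact difference
// between the true product and its rounded value p.
// Illustrative helper, not PCMSolver code.
double product_rounding_error(double a, double b) {
    const double p = a * b;      // product rounded to double
    return std::fma(a, b, -p);   // exact product minus the rounded product
}
```

For `a = 1 + 2^-27`, the call `product_rounding_error(a, a)` returns `2^-54`, whereas computing `a * a - p` with a separate multiply and subtract gives exactly 0. Code that is contracted into FMA instructions can therefore produce slightly different numbers from the same source.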

boegel commented 5 years ago

@robertodr Any updates on this?

I'm seeing the same failing test when building with the Intel compilers, or when building with GCC and using -march=native on recent Intel systems (Intel Haswell & Intel Skylake X). Without -march=native, the tests pass with GCC 7.3, but not with the Intel compilers...

robertodr commented 5 years ago

I accidentally closed and reopened this issue and then forgot to post an actual reply. Apologies for that. There are no updates on this: the issue goes a bit deeper than I had originally expected, and the solution I tried caused other breakages.

boegel commented 5 years ago

@robertodr Can you clarify if this is just an issue with the test itself, or does the test actually signal a problem with the installation that shouldn't be ignored?

I'm wondering whether it's worth simply ignoring the broken test and installing PCMSolver anyway...

Are you aware of a workaround when using the Intel compilers, e.g. by disabling a particular optimization that causes the issue?

rolfheil commented 4 years ago

I could not compile src/green/SphericalDiffuse.cpp using Intel 2019 on Saga, which is possibly related. The error I get is:

[ 16%] Building CXX object src/CMakeFiles/pcm-objlib.dir/green/SphericalDiffuse.cpp.o
In file included from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/math/special_functions/detail/igamma_inverse.hpp(16),
                 from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/math/special_functions/gamma.hpp(2127),
                 from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/math/special_functions/factorials.hpp(14),
                 from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/math/special_functions/binomial.hpp(14),
                 from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/numeric/odeint/stepper/bulirsch_stoer_dense_out.hpp(31),
                 from /cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/numeric/odeint.hpp(45),
                 from /cluster/home/rolfheil/progs/pcmsolver/src/green/InterfacesImpl.hpp(38),
                 from /cluster/home/rolfheil/progs/pcmsolver/src/green/SphericalDiffuse.hpp(42),
                 from /cluster/home/rolfheil/progs/pcmsolver/src/green/SphericalDiffuse.cpp(24):
/cluster/software/Boost/1.71.0-iimpi-2019b/include/boost/math/tools/roots.hpp(810): error: "auto" function requires a trailing return type
  auto quadratic_roots(T const& a, T const& b, T const& c)
  ^

compilation aborted for /cluster/home/rolfheil/progs/pcmsolver/src/green/SphericalDiffuse.cpp (code 2)
make[2]: *** [src/CMakeFiles/pcm-objlib.dir/green/SphericalDiffuse.cpp.o] Error 2
make[1]: *** [src/CMakeFiles/pcm-objlib.dir/all] Error 2
make: *** [all] Error 2

Functions declared with plain auto and no trailing return type (deduced return types) are allowed in the C++14 standard, but not in the C++11 standard. Looking in the Boost header file, there appears to be a check for the C++ standard, but only for C++17 for some reason. Is the C++ standard set for PCMSolver?
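To make the compiler message concrete, here is a minimal sketch of the two declaration forms. The names `add_trailing` and `add_deduced` are hypothetical and are not the Boost code. The first form is valid from C++11 onward. The second relies on return type deduction, which requires C++14 or later, and is the kind of declaration a compiler in C++11 mode rejects with a message like the one above.

```cpp
// C++11 and later: an `auto` function spells out its return type after `->`.
template <typename T>
auto add_trailing(T const& a, T const& b) -> decltype(a + b) {
    return a + b;
}

// C++14 and later: the return type is deduced from the return statement.
// A compiler in C++11 mode rejects this declaration.
template <typename T>
auto add_deduced(T const& a, T const& b) {
    return a + b;
}
```

Both functions behave identically when compiled as C++14, for example `add_trailing(2, 3)` and `add_deduced(2, 3)` both give 5. If a header uses the second form, the translation unit including it has to be built with at least C++14 enabled.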

akesandgren commented 4 years ago

I've tested with Intel 2019.5.281 with various flags, down to -O0 -g, and green_spherical_diffuse fails every time. This is with PCMSolver 1.2.3.

robertodr commented 4 years ago

Since this issue was opened, I haven't had time to look into what the exact source of the problem is. It is unlikely I'll be able to debug it any time soon. If you're using PCMSolver in a quantum chemistry program, I suggest you flag the spherically symmetric diffuse interface functionality as broken for your users. FYI @ilfreddy

sassy-crick commented 2 years ago

As some time has passed, I was wondering if there is an update on this? I seem to be able to build it with -march=native -mno-avx -mno-avx2 on an Intel Gold 5118 CPU, but I cannot test it on other CPUs. If that works reliably, a quick fix could be to compile only this part of the code with the flags mentioned above, and the rest of the code in a more traditional, optimised way.

sassy-crick commented 1 year ago

Further to my last comment: I have installed version 1.3.0 on an AMD EPYC 7742 64-Core Processor with both Intel 2021.4.0 and GCC-11.2.0, and that test is still failing.

Given that this has been dragging on for nearly 4 years now, may I suggest either fixing this ongoing issue or removing the test? I would prefer to get it fixed, as that is the correct solution, assuming the test itself is OK. If the problem is the AVX or AVX2 instruction sets, it would make sense to compile that part of the code differently, as mentioned above. Thanks.