bindgens1 closed this issue 3 years ago
The Intel compiler is supported and we last ran CI tests with version 2018 Update 4 about one week ago. I've just triggered a build on the current master to make sure that it still works: https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/-/jobs/135095.
Which version are you using? Did you enable any non-standard optimizations? What CPU are you running on?
I didn't specify any special compiler flags.
vsc32262@login1:espresso$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018
Copyright (C) 1985-2017 Intel Corporation. All rights reserved.
vsc32262@login1:espresso$ icc -V
Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018
Copyright (C) 1985-2017 Intel Corporation. All rights reserved.
vsc32262@login1:build$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 36
On-line CPU(s) list: 0-35
Thread(s) per core: 1
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping: 4
CPU MHz: 2300.000
BogoMIPS: 4600.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-17
NUMA node1 CPU(s): 18-35
The Intel compiler turns on fast math by default, which relaxes the floating-point rounding rules. This is probably why the unit test fails.
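To illustrate what relaxed rules can change: fast-math modes generally allow the compiler to reassociate floating-point operations, and IEEE 754 addition is not associative. The sketch below uses Python only to show the arithmetic effect. It is an illustration, not a reproduction of what the Intel compiler actually emits for Espresso.

```python
# Floating-point addition is not associative, so reordering a sum
# (which fast-math modes may do) can change the last bit of the result.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # evaluation order as written
reassociated = a + (b + c)    # an order a fast-math compiler might pick instead

difference = abs(left_to_right - reassociated)

print(left_to_right == reassociated)  # False
print(difference)                     # on the order of 1e-16
```

A test that compares against a tolerance of one machine epsilon can pass with one ordering and fail with the other.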
The failures didn't occur two weeks ago. That's why I was a bit concerned.
Apparently your CI doesn't have a problem with the Intel Compiler. Feel free to ignore the issue if you don't consider it a general problem.
Thanks!
Could you try disabling fast-math by adding -fp-model=strict to your CMAKE_CXX_FLAGS? I don't think we've ever tested Espresso with the Intel compiler on a Skylake processor, so it's entirely possible that you found a bug here that should be fixed.
Test suite passes without issue.
Thanks!
Could you please re-open this issue? We should still fix it on the Espresso side. Preferably not by adding that flag, but by actually fixing either the test or the arithmetic.
Sure!
I just thought it might not be needed if it's only affecting me, as we found a way to circumvent the failing tests.
I might add: turning fast math on with the GNU compiler makes the DPD test fail. Nothing else.
Shouldn't be too difficult to fix. Please change the issue title to "fast math breaks some tests" or something like that and I might have a look during the next coding day.
The errors in rotational_inertia look large for rounding issues; maybe the test there isn't ideal, in the sense that errors are adding up. In general it's not clear to me how you can say what the expected precision is if the rounding rules are relaxed. If we want to support that, someone should look up what exactly fast math means, and how much the results depend on hardware.
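As a small illustration of errors adding up, the sketch below accumulates the same value repeatedly and compares the result with a correctly rounded sum. It is a generic Python example and does not use the rotational_inertia test's actual data.

```python
import math

# Each addition rounds its result, so a running total drifts away
# from the exact sum as more terms are accumulated.
values = [0.1] * 10

naive_total = 0.0
for v in values:
    naive_total += v              # one rounding error per addition

exact_total = math.fsum(values)   # correctly rounded sum of the same inputs

accumulated_error = abs(naive_total - exact_total)

print(naive_total == 1.0)   # False
print(exact_total == 1.0)   # True
print(accumulated_error)    # small but nonzero
```

The longer the chain of operations in a test, the more such errors can pile up, so a tolerance that suits a single operation may be too tight for the final result.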
Maybe have a look at the Consistency of Floating-Point Results using the Intel® Compiler, section Bottom Line, which provides Linux/macOS/Windows compiler flags to get consistent results across different CPU architectures. That paragraph is copy-pasted from the PDF in the page footer, which contains more details on the approximations made in fast math mode.
There are a few places in the core, unit tests and python tests where floating-point comparison is carried out with extremely small tolerances. These operations break in fast-math mode (tested with GCC and Clang) and could potentially break on old 32-bit architectures, e.g. the i586 arch which is currently part of the python3-espressomd package CI pipeline on OpenSUSE.
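One possible way to make such comparisons less brittle is to allow a few multiples of machine epsilon, scaled by the magnitude of the operands. The helper below is a hypothetical sketch: the name almost_equal and the default of 4 epsilons are assumptions chosen for illustration, not taken from the Espresso code base.

```python
import sys

EPS = sys.float_info.epsilon  # machine epsilon for IEEE 754 doubles


def almost_equal(x, y, n_eps=4):
    """Hypothetical helper: True if x and y agree to within n_eps epsilons,
    scaled by the larger magnitude (with a floor of 1.0)."""
    scale = max(abs(x), abs(y), 1.0)
    return abs(x - y) <= n_eps * EPS * scale


# 0.1 + 0.2 differs from 0.3 by roughly one rounding error.
print(0.1 + 0.2 == 0.3)               # False
print(almost_equal(0.1 + 0.2, 0.3))   # True
print(almost_equal(1.0, 1.0 + 1e-9))  # False
```

The right multiple depends on how many operations feed into the compared value, so it would have to be chosen per test.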
I have two failing tests on our cluster with the current development branch:
1)
espresso/src/core/unit_tests/grid_test.cpp(79)
espresso/src/core/unit_tests/grid_test.cpp(82)
Fails with:
error: in "get_mi_coord_test": absolute value of std::abs(get_mi_coord(a, b, box_l, true) - (a - b) - box_l){4.4408920985006262e-16} exceeds 2.2204460492503131e-16
error: in "get_mi_coord_test": absolute value of std::abs(get_mi_coord(b, a, box_l, true) - (b - a) + box_l){4.4408920985006262e-16} exceeds 2.2204460492503131e-16
This seems to be the same issue twice.
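The two numbers in the error messages can be decoded with a short check. The sketch below parses the values copied from the output above and compares them with the machine epsilon for doubles. That the test uses machine epsilon as its tolerance is an inference from the reported limit, not something confirmed from grid_test.cpp.

```python
import sys

eps = sys.float_info.epsilon  # machine epsilon for IEEE 754 doubles

# Values copied verbatim from the error messages above.
reported_error = float("4.4408920985006262e-16")
reported_limit = float("2.2204460492503131e-16")

limit_is_epsilon = reported_limit == eps
error_is_two_epsilon = reported_error == 2 * eps

print(limit_is_epsilon)      # True
print(error_is_two_epsilon)  # True
```

So the observed deviation is exactly two epsilons against an allowed one, which is the size of discrepancy that a different rounding or evaluation order can produce.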
2)
I compiled Espresso with the Intel compiler. Are these known issues that occurred before? In case the Intel compiler is no longer supported, I will try again with gcc in the next few days.