Open HanatoK opened 1 month ago
Reporting here a Slack message from Dave Hardy:
We've had some past issues with over-optimization by Intel's compilers leading to errors in NAMD. Eric Bohm did some testing and determined that this particular issue is resolved by the Intel 2023 compilers.
Do they have MKL available alongside that compiler on Frontera?
Yes, but I later found that the problem is not caused by the eigendecomposition. Even the following simple code, which computes the correlation matrix without calling any third-party library, can produce wrong results:
#include <cstddef>
#include <vector>

struct Coordinate {
  double x;
  double y;
  double z;
};
using AtomGroup = std::vector<Coordinate>;

void build_correlation_matrix(
    const AtomGroup& ag, const AtomGroup& ag_ref, double out[3][3]) {
  // Zero the local accumulator.
  double mat_R[3][3];
  for (size_t i = 0; i < 3; ++i) {
    for (size_t j = 0; j < 3; ++j) {
      mat_R[i][j] = 0;
    }
  }
  // Accumulate the products of each atom's coordinates with its reference.
  for (size_t i = 0; i < ag.size(); ++i) {
    mat_R[0][0] += ag[i].x * ag_ref[i].x;
    mat_R[0][1] += ag[i].x * ag_ref[i].y;
    mat_R[0][2] += ag[i].x * ag_ref[i].z;
    mat_R[1][0] += ag[i].y * ag_ref[i].x;
    mat_R[1][1] += ag[i].y * ag_ref[i].y;
    mat_R[1][2] += ag[i].y * ag_ref[i].z;
    mat_R[2][0] += ag[i].z * ag_ref[i].x;
    mat_R[2][1] += ag[i].z * ag_ref[i].y;
    mat_R[2][2] += ag[i].z * ag_ref[i].z;
  }
  // print_matrix<3, 3>(mat_R);
  // Copy the result to the caller's array.
  for (size_t i = 0; i < 3; ++i) {
    for (size_t j = 0; j < 3; ++j) {
      out[i][j] = mat_R[i][j];
    }
  }
}
In my opinion the Intel/2019 (19.1.1) compiler is just too dangerous to use.
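For anyone who wants to check their own build, here is a minimal sketch of a self-check for the function above. It compares the unrolled accumulation against a deliberately slow reference that sums one matrix element at a time through a `volatile` accumulator. The names `build_correlation_matrix_reference` and `max_abs_difference` are hypothetical helpers introduced for this sketch, not part of the linked test repository, and the `volatile` trick is only a heuristic to discourage the optimizer from vectorizing the reference loop, not a guarantee.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Coordinate {
  double x;
  double y;
  double z;
};
using AtomGroup = std::vector<Coordinate>;

// Unrolled accumulation, same arithmetic as the reproducer in this thread.
void build_correlation_matrix(
    const AtomGroup& ag, const AtomGroup& ag_ref, double out[3][3]) {
  assert(ag.size() == ag_ref.size());
  double mat_R[3][3] = {};
  for (std::size_t i = 0; i < ag.size(); ++i) {
    mat_R[0][0] += ag[i].x * ag_ref[i].x;
    mat_R[0][1] += ag[i].x * ag_ref[i].y;
    mat_R[0][2] += ag[i].x * ag_ref[i].z;
    mat_R[1][0] += ag[i].y * ag_ref[i].x;
    mat_R[1][1] += ag[i].y * ag_ref[i].y;
    mat_R[1][2] += ag[i].y * ag_ref[i].z;
    mat_R[2][0] += ag[i].z * ag_ref[i].x;
    mat_R[2][1] += ag[i].z * ag_ref[i].y;
    mat_R[2][2] += ag[i].z * ag_ref[i].z;
  }
  for (int a = 0; a < 3; ++a) {
    for (int b = 0; b < 3; ++b) {
      out[a][b] = mat_R[a][b];
    }
  }
}

// Hypothetical reference: the same sums in the same order, but one matrix
// element at a time, with a volatile accumulator (assumption: this keeps
// the optimizer from vectorizing or fusing the accumulation).
void build_correlation_matrix_reference(
    const AtomGroup& ag, const AtomGroup& ag_ref, double out[3][3]) {
  assert(ag.size() == ag_ref.size());
  for (int a = 0; a < 3; ++a) {
    for (int b = 0; b < 3; ++b) {
      volatile double acc = 0.0;
      for (std::size_t i = 0; i < ag.size(); ++i) {
        const double p[3] = {ag[i].x, ag[i].y, ag[i].z};
        const double q[3] = {ag_ref[i].x, ag_ref[i].y, ag_ref[i].z};
        acc = acc + p[a] * q[b];
      }
      out[a][b] = acc;
    }
  }
}

// Hypothetical helper: largest absolute element-wise difference between
// the unrolled result and the reference result.
double max_abs_difference(const AtomGroup& ag, const AtomGroup& ag_ref) {
  double fast[3][3];
  double ref[3][3];
  build_correlation_matrix(ag, ag_ref, fast);
  build_correlation_matrix_reference(ag, ag_ref, ref);
  double worst = 0.0;
  for (int a = 0; a < 3; ++a) {
    for (int b = 0; b < 3; ++b) {
      worst = std::fmax(worst, std::fabs(fast[a][b] - ref[a][b]));
    }
  }
  return worst;
}
```

With a healthy compiler, `max_abs_difference` should return zero or something on the order of rounding error relative to the matrix entries (fused multiply-add can change the last bits). A difference far above that for the same input would point to the kind of miscompilation discussed here, though the exact threshold is a judgment call.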
Unbelievable. This code could not be any simpler.
It definitely looks that way. Thank you for checking!
The test code can be found in https://github.com/HanatoK/Intel_Compiler_2019_bug_test, which should use the same algorithm as Colvars. However, even if I use Eigen3 or math_eigen_impl.h, the calculations are consistently wrong with the Intel 2019 compiler. This issue happens on Frontera with the default Intel compiler, the version of which shows
This issue affects the calculation of orientation and Euler angles the most, and RMSD seems to be less affected.
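To illustrate why a wrong correlation matrix would show up in the orientation, here is a rough sketch of one common way to go from the 3x3 correlation matrix to an optimal rotation: Horn's closed-form quaternion method, which builds a symmetric 4x4 matrix whose eigenvector with the largest eigenvalue is the unit quaternion of the best-fit rotation. Whether this matches the Colvars implementation term for term is an assumption on my part. The type aliases `Matrix3` and `Matrix4` and the function `build_overlap_matrix` are hypothetical names for this sketch, and the index convention follows the code in this thread (`R[0][1]` is the sum of `x * y_ref`).

```cpp
#include <array>

using Matrix3 = std::array<std::array<double, 3>, 3>;
using Matrix4 = std::array<std::array<double, 4>, 4>;

// Build Horn's symmetric, traceless 4x4 matrix from the 3x3 correlation
// matrix R. Every entry is a signed sum of entries of R, so an error in
// any element of R propagates directly into this matrix and from there
// into its eigenvectors.
Matrix4 build_overlap_matrix(const Matrix3& R) {
  Matrix4 S{};
  // Diagonal entries.
  S[0][0] =  R[0][0] + R[1][1] + R[2][2];
  S[1][1] =  R[0][0] - R[1][1] - R[2][2];
  S[2][2] = -R[0][0] + R[1][1] - R[2][2];
  S[3][3] = -R[0][0] - R[1][1] + R[2][2];
  // Upper triangle.
  S[0][1] = R[1][2] - R[2][1];
  S[0][2] = R[2][0] - R[0][2];
  S[0][3] = R[0][1] - R[1][0];
  S[1][2] = R[0][1] + R[1][0];
  S[1][3] = R[2][0] + R[0][2];
  S[2][3] = R[1][2] + R[2][1];
  // Mirror into the lower triangle; the matrix is symmetric.
  for (int i = 0; i < 4; ++i) {
    for (int j = 0; j < i; ++j) {
      S[i][j] = S[j][i];
    }
  }
  return S;
}
```

The eigendecomposition itself is left out on purpose. The point of the sketch is only that the quaternion, and therefore the orientation and Euler angles, is computed from eigenvectors of a matrix assembled entirely from the correlation matrix, so switching the eigensolver cannot help if the correlation matrix is already wrong.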