plumed / plumed2

Development version of plumed 2
https://www.plumed.org
GNU Lesser General Public License v3.0
357 stars 284 forks source link

Test new latest intel compiler on GitHub Action #990

Closed carlocamilloni closed 10 months ago

carlocamilloni commented 10 months ago

This enables back the intel compilers testing and switches it to the new clang based version, cf. #976

carlocamilloni commented 10 months ago

@valsson @invemichele multiple VES and OPES regtest fail using the new intel compilers, could you please check if there is anything relevant?

carlocamilloni commented 10 months ago

@GiovanniBussi there are also tests that fail with a segfault like rt-make-threads and other that fails of +-nan, didn't we already solved the +- issue?

invemichele commented 10 months ago

I just checked, for OPES it fails every time I rely on NaNs, it seems they are interpreted as zeros. In particular std::isnan seems to always return false. Maybe it's related to this? https://stackoverflow.com/questions/32195949/why-does-nan-nan-0-0-with-the-intel-c-compiler https://stackoverflow.com/questions/47703436/isnan-does-not-work-correctly-with-ofast-flags

codecov-commenter commented 10 months ago

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (a0973e3) 84.08% compared to head (647440c) 84.07%. Report is 1 commits behind head on master.

:exclamation: Current head 647440c differs from pull request most recent head 2e7b680. Consider uploading reports for the commit 2e7b680 to get more accurate results

Files Patch % Lines
src/tools/NeighborList.cpp 89.23% 7 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #990 +/- ## ========================================== - Coverage 84.08% 84.07% -0.02% ========================================== Files 602 602 Lines 56197 56228 +31 ========================================== + Hits 47255 47274 +19 - Misses 8942 8954 +12 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

GiovanniBussi commented 10 months ago

Interesting. I am afraid @invemichele identified the point with NaN, which might affect other tests. Regarding the rt-make-threads I will try to check locally (it seems we have icpx 2021 installed, hopefully it can reproduce the bug).

@carlocamilloni was the problem present also with (old) icpc? Or maybe the new suites does not include icpc, and we are forced to switch to icpx?

carlocamilloni commented 10 months ago

no we are not force to icpx yet but I think it is worth moving to it because icpc is deprecated. We can possibly overcome to nan issue by not using the default fast-math behaviour as mentioned in the post above, but at the same time I am not sure this is good because then people using the compiler will still get the error

Miniland1333 commented 10 months ago

Based on this stackoverflow article, perhaps check if the issue also goes away if you use -O2 instead of -O3. If this is the case, the stackoverflow article recommends perhaps using -fp-model=precise with -O3.

invemichele commented 10 months ago

On my side, using NaNs is a convenient way to do the parsing, but I could easily add a workaround for the case isnan(nan)==false so that the code is more robust

carlocamilloni commented 10 months ago

I think that generally speaking is not worth trading performances to parsing so it would be better to make the parsing more robust, so if a solution exist for the parsing I would say is better, @GiovanniBussi @maxbonomi @gtribello

GiovanniBussi commented 10 months ago

I agree with latest @carlocamilloni 's comment.

Regarding the problem on rt-make-threads, it's misleading now. Due to an incorrect coding (that I am going to fix), the code crashes. The underlying error is just numeric. I guess it will be sufficient to change a little bit the trajectory in the test to make it work with optimizing icpx

valsson commented 10 months ago

Hello,

From what I can see, there are two failures in the VES regtests:

Changing FORMAT_VALUES_DERIVS=%13.6f to FORMAT_VALUES_DERIVS=%13.5f would reduce the precision of the relevant files.

Will you do this change? Or should I do that via a PR on the master branch?

carlocamilloni commented 10 months ago

@valsson thanks, I have just pushed the change into this pull request

valsson commented 10 months ago

It seems that this commit has some unintended consequences on other files, so now it failing due to other issues. It is related to the bf*.targetdist-averages.data.reference. @carlocamilloni did you do that change on a Apple computer? I think it is related to issues with Apple clang compiler that I still needed to fix (see #950). I think that reverting all the bf*.targetdist-averages.data.reference files to the ones previous to this commit should fix the issue on GH action.

If you revert this commit, I can do a PR on the master branch where I only commit the relevant files related to FORMAT_VALUES_DERIVS=%13.5f

carlocamilloni commented 10 months ago

@valsson indeed, reverted, thanks

invemichele commented 10 months ago

I make the relevant changes here #992 Notice that there are other parts of the code that rely on std::isnan, e.g. https://github.com/plumed/plumed2/blob/25abfd876b6260465593b7c277d83e9e74cfc722/src/vatom/Center.cpp#L221 probably it's worth checking for possible issues not caught by the regtests

carlocamilloni commented 10 months ago

tests still failing: @invemichele @GiovanniBussi @valsson (this is to keep track of progresses)

invemichele commented 10 months ago

for my code I switched to a much simpler solution that does not use NaNs and works the same in practice, see #993

carlocamilloni commented 10 months ago

@valsson could you please decrease it by one more digit?