Closed by ikalash 2 years ago
Yes, I recommend working on the nightly test directory to minimize the effort:
cd /home/lcm/LCM
source ./env-all.sh
cd "$LCM_DIR"
module purge
module load serial-clang-release
./clean-config-build.sh trilinos 72 -V && ./clean-config-build.sh lcm 72 -V
If you want a debug build, just change the module load line to:
module load serial-clang-debug
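For convenience, the steps above could be wrapped in a small script so the release/debug choice becomes an argument. This is only a sketch: the function names lcm_module_name and rebuild_lcm are made up for illustration, and it assumes the paths, module names, and clean-config-build.sh arguments are exactly as given in this comment.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a build type (release or debug) to the
# module name used in this thread. Defaults to release.
lcm_module_name() {
  local build_type="${1:-release}"
  case "$build_type" in
    release|debug) echo "serial-clang-${build_type}" ;;
    *) echo "unknown build type: ${build_type}" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper around the steps listed above.
# Nothing runs until this function is called explicitly.
rebuild_lcm() {
  local module_name
  module_name="$(lcm_module_name "${1:-release}")" || return 1

  cd /home/lcm/LCM || return 1
  source ./env-all.sh
  cd "$LCM_DIR" || return 1

  module purge
  module load "$module_name"

  # Build Trilinos first, then LCM, same arguments as in the comment above.
  ./clean-config-build.sh trilinos 72 -V && ./clean-config-build.sh lcm 72 -V
}
```

After sourcing the file, rebuild_lcm would rebuild the release configuration and rebuild_lcm debug the debug one.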
Curiously, the issue does not show up in a debug build, so it is unclear how to proceed. Perhaps one can try switching to an Ifpack2 preconditioner, since the FPE is in MueLu. Unfortunately I can't try this, since I don't have permission to access the /home/lcm clang release build from the nightlies.
Ok, so it looks like the initial nonlinear solves fail for a while (maybe the time step is too big initially), and for some reason MueLu doesn't like this and barfs with clang. You can see the failed initial nonlinear solves in the gcc build, which does run: https://sems-cdash-son.sandia.gov/cdash/test/1878483 (||F|| stagnates). Switching to Ifpack2 circumvents the problem and leads to the tests passing. I'll check in the fix now and close this tomorrow if the case tests clean.
@ikalash Loads of fun! Thanks for digging into this!
No problem! This was way easier than issue #51!
The ACE MiniErosion test with denudation began failing in Clang builds after Algol was upgraded to Fedora 35. Floating-point exceptions (FPEs) are encountered in the test:
https://sems-cdash-son.sandia.gov/cdash/test/1879062
Curiously, the problem does not show up in the gcc build.
@lxmota: would you be able to provide instructions for how I can recreate the nightly clang build on Algol, so I can try to debug the problem in the serial run?