Closed lars2015 closed 10 months ago
The performance degradation was traced to the new version of the NVIDIA C compiler, which vectorizes a specific loop in module_mixing_help() that the previous version left unvectorized.
The problem has been fixed by manually disabling vectorization for the loop:
https://github.com/slcs-jsc/mptrac/blob/a50a0581cb1df2a3ab1f4c8559330b9418d248b9/src/trac.c#L2013
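For readers unfamiliar with this kind of workaround, here is a minimal sketch of disabling vectorization for a single loop. It is not the code from the linked commit: the function `mean_novector` is hypothetical, and the sketch assumes the `#pragma novector` loop directive documented for the NVIDIA HPC compilers, guarded by the `__NVCOMPILER` macro so that other compilers ignore it.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from MPTRAC): returns the mean of n values. */
double mean_novector(const double *x, size_t n) {
  double sum = 0.0;

  /* Assumption: NVIDIA HPC compilers define __NVCOMPILER and honor
     "#pragma novector" for the loop that follows. The guard keeps
     other compilers from warning about an unknown pragma. */
#ifdef __NVCOMPILER
#pragma novector
#endif
  for (size_t i = 0; i < n; i++)
    sum += x[i];

  /* Avoid division by zero for an empty input. */
  return n > 0 ? sum / (double) n : 0.0;
}
```

The pragma only affects the loop directly after it, so the rest of the translation unit is still optimized as usual. Whether it changes the runtime in a given case needs to be checked with the timers.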
Describe the bug
MPTRAC nightly builds (https://datapub.fz-juelich.de/slcs/mptrac/nightly_builds/) show that the runtime of the physics timers on JUWELS Booster increased significantly on 3 Nov 2023, when the new software stage 2024 was enabled. Switching back to stage 2023 on 10 Nov restored the original runtime.
Additionally, the new stage causes an issue in the sample.tab output of gpu_test.
To Reproduce
Rerun the test case at: /p/fastdata/slmet/slmet111/model_data/mptrac/nightly_builds/juwels-booster/run.sh
Expected behavior
This issue needs further investigation to find the root cause.
Perhaps other changes to the JUWELS Booster system configuration were introduced alongside the software stage update?