Closed pgrete closed 3 months ago
I don't understand how the CI passed with this bug...?
Also: would it make sense to add an action to run the CI with ASAN enabled to catch unitialized value bugs in general?
I don't understand how the CI passed with this bug...?
I think that hasGhost = 0
is the correct value for all of the restart tests, so if it was default initialized to zero then things would work even with the bug.
Also: would it make sense to add an action to run the CI with ASAN enabled to catch unitialized value bugs in general?
I agree that this would be a good idea.
PR Summary
hasGhost
was defined both inrestart.hpp
andrestart_hdf5.hpp
We only set the value inside the constructor of the derived class but in the parthenon manager we work with the base class. The latter actually set the data based on bounds defined byhasGhosts
so it made a decision based on uninitialized data. On Frontier I could reliably reproduce the issue (both with cray and amdclang/hip compilers), whereas on other machines/compilershasGhosts
was set by default/accident to0
in the base class (though also rendering reading restarts with ghosts unusable in current develop). Instead of just removing the duplicated declaration I decided to remove public variable entirely and go with a getter function. I also get the data type toint
for backwards compatibility and in case we eventually decide to do sth more with this rather than a bool check.PS: With this (hopefully) last PR, we can finally restart/do some analysis with AthenaPK in sync with Parthenon
develop
again.PR Checklist