parthenon-hpc-lab / parthenon

Parthenon AMR infrastructure
https://parthenon-hpc-lab.github.io/parthenon/

Check that `ParameterInput` is the same on every rank #802

Open Yurlungur opened 1 year ago

Yurlungur commented 1 year ago

@lroberts36 discovered that it's possible for `ParameterInput` to contain different information on different MPI ranks. This will cause parallel HDF5 to hang, as it prevents the write of the `ParameterInput` string from being collective.

On the Parthenon call, we agreed that it's fine to disallow inconsistent `ParameterInput` contents across ranks. However, we should check for this. The suggested solution is an all-to-all check that the hashes of the `ParameterInput` string are equal on every rank.
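
To make the suggestion concrete, here is a rough sketch of the comparison logic, not actual Parthenon code. The names `Fnv1a64` and `InputsConsistent` are hypothetical, and the per-rank strings are passed in as a plain vector so the logic can be tried without MPI. FNV-1a is used instead of `std::hash` only because the standard does not guarantee that `std::hash` gives the same value in different processes; any deterministic hash would do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: 64-bit FNV-1a hash of a string. Deterministic, so
// every rank computes the same value for the same bytes.
std::uint64_t Fnv1a64(const std::string &s) {
  std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL; // FNV prime
  }
  return h;
}

// Hypothetical helper: stand-in for the collective check. Each entry is
// the serialized input string as seen by one rank. In an MPI build, the
// loop below would presumably become two MPI_Allreduce calls on the local
// hash (MPI_UINT64_T with MPI_MIN and with MPI_MAX); the inputs agree
// exactly when the global minimum equals the global maximum.
bool InputsConsistent(const std::vector<std::string> &per_rank_dumps) {
  assert(!per_rank_dumps.empty()); // an MPI job always has at least one rank
  std::uint64_t lo = Fnv1a64(per_rank_dumps[0]);
  std::uint64_t hi = lo;
  for (const auto &dump : per_rank_dumps) {
    const std::uint64_t h = Fnv1a64(dump);
    lo = std::min(lo, h);
    hi = std::max(hi, h);
  }
  return lo == hi;
}
```

Reducing to a minimum and a maximum means every rank learns the result, so all ranks can fail together before reaching the collective HDF5 write, and the cost is two small reductions rather than an exchange of the full strings. Equal hashes do not strictly prove equal strings, but a 64-bit collision between two slightly different input decks is very unlikely.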

jvr0123 commented 4 months ago

I'm interested in taking this if it's not stale

Yurlungur commented 4 months ago

I don't think this is stale. Your help would be welcome! Let us know if you need some help getting started.