The code seems to fail catastrophically when a restart file is used with an executable compiled to break the domain into a different number of segments than the executable that produced the restart file.
An example of this behavior can be seen when restarting the 0.9375 MAD 256x128x128 run from restart_00000142.h5. The original code was compiled with NxCPU = 2,8,8; when the restart is run with an executable compiled with NxCPU = 1,4,4, divbmax increases dramatically after a few M, and the code ultimately fails several M after that with a ctop error.
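To make the mismatch concrete, here is a minimal sketch of the per-process block sizes implied by the two builds. It assumes that 256x128x128 is the global grid and that the domain is split evenly along each axis; the function name `local_block_shape` is hypothetical and not part of the code base.

```python
def local_block_shape(global_shape, ncpu):
    """Return the per-process block shape for an even domain split.

    Hypothetical helper: assumes each axis is divided evenly among
    the processes along that axis.
    """
    if len(global_shape) != len(ncpu):
        raise ValueError("global_shape and ncpu must have the same length")
    shape = []
    for n, p in zip(global_shape, ncpu):
        if p <= 0 or n % p != 0:
            raise ValueError(f"{n} zones cannot be split evenly over {p} processes")
        shape.append(n // p)
    return tuple(shape)


GLOBAL = (256, 128, 128)  # assumed global resolution from the report

original = local_block_shape(GLOBAL, (2, 8, 8))   # build that wrote the restart
restarted = local_block_shape(GLOBAL, (1, 4, 4))  # build that read the restart
print("original block:", original)
print("restarted block:", restarted)
```

Under these assumptions the restart file was written from 128 blocks of 128x16x16 zones and read back into 16 blocks of 256x32x32 zones, so any restart logic that relies on the writer's block layout would place data incorrectly.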
The following is a diff between restarting mid-MAD with 1 and 16 processes (lower right is divB, which does not affect the flow). I think I can finally safely close this.