SW4 (Seismic Waves, 4th order) implements substantial capabilities for 3-D seismic modeling, with a free surface condition on the top boundary, absorbing super-grid conditions on the far-field boundaries, and an arbitrary number of point force and/or point moment tensor source terms.
Restart issue: "readSACheader: ERROR" in Artie's h200 run #31
Artie ran a 4-node h200 case on Cori, interrupted it with scancel after 1 hour, and then got this error on restart.
I have not been able to reproduce it, but it seems plausible if the checkpoint was written while some time series file was not completely written.
The solution is to use the previous checkpoint and make sure each time series file has a backup with the ".bak" suffix, see 6f17b3. If a restart is required, the user must verify:
That the last checkpoint and time series files were all written successfully and are self-consistent. Things to check include file sizes, timestamps, and the number of files.
Or, if that is not the case, use the previous checkpoint and copy the .bak time series files to the current file names.
Then the files should be set up consistently for restart.
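
The checks above could be scripted roughly as follows. This is a minimal sketch, not part of SW4: the function names `find_problems` and `restore_backups` are hypothetical, and it assumes each backup is named as the full time series file name plus ".bak". It only catches missing files, empty files, and a wrong file count; timestamps and expected sizes still need to be compared by hand.

```python
import shutil
from pathlib import Path


def find_problems(files, expected_count):
    """Return a list of human-readable problems; an empty list means
    nothing obviously wrong was found (hypothetical helper)."""
    problems = []
    if len(files) != expected_count:
        problems.append(f"expected {expected_count} files, found {len(files)}")
    for f in map(Path, files):
        if not f.is_file():
            problems.append(f"missing: {f}")
        elif f.stat().st_size == 0:
            problems.append(f"empty: {f}")
    return problems


def restore_backups(series_files, suffix=".bak"):
    """Copy each '<name>.bak' over '<name>' and return the files restored.
    Assumes the backup name is the full file name plus the suffix."""
    restored = []
    for f in map(Path, series_files):
        backup = f.with_name(f.name + suffix)
        if backup.is_file():
            shutil.copy2(backup, f)  # copy2 also preserves the timestamp
            restored.append(f)
    return restored
```

If `find_problems` reports anything for the latest checkpoint and time series files, the fallback is to point the restart at the previous checkpoint and call `restore_backups` on the time series files. A non-empty file can still be truncated, so a clean result here does not prove the files are complete.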