emitra17 closed 5 years ago
This has survived a run on ulk much longer than my attempt with the old version did, but I can't really tell why: the log doesn't show any instance where the new error catch saved the run.
The ulk run eventually died after about 365 replicates, apparently from garbage collection warnings. Confirmed that the garbage collection warning is not unique to this branch.
Although I haven't yet seen this save us in a non-synthetic situation in testing, I'm in favor of merging as is: it doesn't seem to break anything, and in principle it should help if some unexpected exception comes up.
Latest attempt at a job submission workflow that can recover if the job throws an exception.
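For readers unfamiliar with the pattern, here is a rough sketch of what "recover if the job throws an exception" could look like. It is not the code in this branch: Python is used only for illustration, and `run_with_recovery`, `job`, `n_replicates`, and `max_retries` are hypothetical names. Each replicate is retried a few times, and every caught exception is logged so the log shows when the catch actually saved a run.

```python
import logging
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("job_recovery")  # hypothetical logger name


def run_with_recovery(
    job: Callable[[int], T],   # hypothetical: runs one replicate by index
    n_replicates: int,
    max_retries: int = 3,
) -> List[Optional[T]]:
    """Run each replicate, retrying on any exception instead of aborting the whole run."""
    results: List[Optional[T]] = []
    for rep in range(n_replicates):
        result: Optional[T] = None
        for attempt in range(1, max_retries + 1):
            try:
                result = job(rep)
                break  # replicate succeeded; stop retrying
            except Exception as exc:
                # Log every recovery so it is visible when the catch saved the run.
                logger.warning(
                    "replicate %d, attempt %d failed: %r", rep, attempt, exc
                )
        # None marks a replicate that failed on every attempt.
        results.append(result)
    return results
```

Note that a wrapper like this only handles exceptions raised inside the job. It would not help with failures that kill the process itself, which may be why it would not show up against something like the garbage collection issue.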
I am currently testing stability with a long-running bootstrap run on ulk.