Open mcw92 opened 1 year ago
This is needed for checkpointing long-running individuals (those exceeding a single job run).
Add the run config to the checkpoint and validate it on resume. For now, a run with an inconsistent migration topology or number/distribution of workers should probably warn and crash. In the future, it might set up a fresh checkpoint so as to preserve and separate the different runs.
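
A rough sketch of what the validate-on-resume step could look like. All names here (`RunConfig`, `save_checkpoint`, `load_checkpoint`, `CheckpointMismatchError`, and the config fields) are hypothetical and not taken from the existing code base; it uses `pickle` only to keep the example dependency-free, and the same check should carry over to any other checkpoint format.

```python
import pickle
import warnings
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical run settings stored next to the checkpointed state."""

    num_workers: int
    workers_per_island: tuple  # distribution of workers, e.g. (2, 2)
    migration_topology: tuple  # nested tuples, e.g. an adjacency matrix


class CheckpointMismatchError(RuntimeError):
    """Raised when the resuming run's config differs from the stored one."""


def save_checkpoint(path, state, config):
    """Write the state together with the config of the run that produced it."""
    payload = {"config": asdict(config), "state": state}
    with open(path, "wb") as f:
        pickle.dump(payload, f)


def load_checkpoint(path, config):
    """Load the state, but warn and crash if the stored config differs."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    saved = payload["config"]
    current = asdict(config)
    # Collect every field whose stored value differs from the current run.
    mismatched = sorted(k for k in current if saved.get(k) != current[k])
    if mismatched:
        message = f"Checkpoint config mismatch in: {', '.join(mismatched)}"
        warnings.warn(message)
        raise CheckpointMismatchError(message)
    return payload["state"]
```

Resuming with the same `RunConfig` returns the stored state; resuming with, say, a different `num_workers` warns and raises. The "fresh checkpoint" behaviour mentioned above could later replace the `raise` with writing to a new path.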
Implement HDF5 checkpointing instead of pickles for better interoperability.