Fix Checkpoint/Restart

Steven's fixes for checkpoint/restart, addressing issue #750.

When a task has been restarted more times than num_tries allows, the workflow should be put in a Failed state. I'm not sure whether this should be the Archived state instead, which I think is what happens with other workflows.
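To make the intended behavior concrete, here is a minimal sketch of the restart limit. It is not the code from this PR: the `Task` class, the `restarts` counter, and `workflow_state_after_restart` are hypothetical names, and the strict `>` comparison against `num_tries` is an assumption about where the limit applies.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for a checkpoint/restart task."""
    name: str
    num_tries: int      # maximum number of restarts allowed (assumed meaning)
    restarts: int = 0   # restarts performed so far


def workflow_state_after_restart(task: Task) -> str:
    """Record one more restart and return the resulting workflow state.

    Assumption: the workflow fails once the restart count exceeds num_tries;
    until then it keeps running.
    """
    task.restarts += 1
    if task.restarts > task.num_tries:
        return "Failed"
    return "Running"
```

If Archived turns out to be the right final state, only the returned string in the failure branch would change in this sketch.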

The checkpoint-restart tests can be run locally, after starting beeflow, with ./ci/integration_test.py -t checkpoint_restart_failure,checkpoint_restart --timeout 10000 (this may take more than 10 minutes).

lanl / BEE

Fix Checkpoint/Restart #776