aquan9 closed 4 months ago
Should this be optional or automatic? Should there be a configuration option that limits resubmission to a fixed number of tries?
I wonder if checkpoint-restart would affect this? We might want to only resubmit tasks if the `num_tries` in `beeflow:CheckpointRequirement` allows it. I'm not sure if the number of times a task has already been restarted is stored in a database, so we might lose it on task manager failure.
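To make the concern concrete, here is a minimal sketch of a resubmission limit whose counter survives a restart. It is not BEE's actual implementation. The table name `task_tries`, the helper names, and the use of SQLite are all assumptions for illustration, and `max_tries` stands in for whatever limit `num_tries` or a configuration option would supply.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical store for per-task try counts.

    Pass a file path instead of ":memory:" so counts survive a process restart.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_tries ("
        "task_id TEXT PRIMARY KEY, tries INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def tries_so_far(conn, task_id):
    """Return how many resubmissions have been recorded for this task."""
    row = conn.execute(
        "SELECT tries FROM task_tries WHERE task_id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else 0


def try_resubmit(conn, task_id, max_tries):
    """Record one resubmission if the limit allows it.

    Returns True if the task may be resubmitted, False if the limit is reached.
    """
    if tries_so_far(conn, task_id) >= max_tries:
        return False
    # Create the row on first use, then increment the stored count.
    conn.execute(
        "INSERT OR IGNORE INTO task_tries (task_id, tries) VALUES (?, 0)",
        (task_id,),
    )
    conn.execute(
        "UPDATE task_tries SET tries = tries + 1 WHERE task_id = ?", (task_id,)
    )
    conn.commit()
    return True
```

Because the count lives in the database rather than in memory, a restarted task manager would read the existing count instead of starting from zero. Whether the first submission counts toward the limit is a design choice this sketch leaves open.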
I think this issue needs some discussion and clarification, maybe during a meeting.
Resolved with PR #827
Pieces broken up from #614