Closed by santhnm2 4 years ago
Probably want a corresponding `_open_trace` method as well. I think we also need to record some other state, such as job completion times. It might make sense to checkpoint all the main state (priorities, job_completion_times, etc.). Not sure though.
The trace can be passed in to the `run_scheduler_with_trace.py` script, though we would need to merge it with any jobs that have yet to be dispatched. As for the other state, I think the only thing we care about is the completion times of the jobs that have already finished, which we can get from the log written before the failure.
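For reference, here is a minimal sketch of what checkpointing the main state could look like. It is not the scheduler's actual code: the function names `save_checkpoint` and `load_checkpoint`, the layout of the `state` dict, and the choice of `pickle` are all assumptions made for illustration. It writes to a temporary file and then renames it, so a crash mid-write should not leave a truncated checkpoint behind.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write `state` to `path` via a temp file plus rename.

    `state` is a hypothetical dict, e.g.
    {"priorities": {...}, "job_completion_times": {...}}.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so os.replace stays
    # on one filesystem and can swap the file in atomically.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path):
    """Read back a state dict written by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

If only the completion times of finished jobs matter, the `state` dict could shrink to that single entry, or be dropped in favor of parsing the log as suggested above.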