DCPROGS / HJCFIT

Full maximum likelihood fitting of a mechanism directly to the entire sequence of open and shut times, with exact missed events correction.
GNU General Public License v3.0

Jobs to resume execution when re-run #103

Open raquelalegre opened 8 years ago

raquelalegre commented 8 years ago

To launch a job on ARCHER, you have to estimate beforehand how long it will take to run (the wall time you request).

You shouldn't overestimate the run time just to be safe, because the scheduler will make the job wait longer in the queue (generally, the longer the requested wall time, the longer the wait).

On the other hand, if you underestimate the time, the job will be stopped before it converges: you can't see the outputs and will have to rerun the job, wait in the queue again, etc.

We can modify the code so that the next time a stopped job is run, it resumes execution at the iteration where it left off. This is not in the proposal, but at the last meeting it was discussed as a nice-to-have.
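
To make the idea concrete, here is a rough sketch of one way resuming could work: save the fit state to a checkpoint file after every iteration, and load it at start-up if it exists. This is not HJCFIT code. The names `run_fit`, `save_checkpoint`, `load_checkpoint`, the `step` callback and the JSON file layout are all hypothetical, and the convergence test is a placeholder for whatever the real optimiser uses.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically, so a job killed mid-write
    never leaves a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if there is no checkpoint yet."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return json.load(handle)


def run_fit(step, initial_params, max_iterations, checkpoint_path,
            tolerance=1e-8):
    """Iterate `step` (hypothetical: params -> new params) until the
    parameters stop changing or max_iterations is reached.

    If a checkpoint exists, continue from the iteration stored in it
    instead of starting again from initial_params.
    """
    state = load_checkpoint(checkpoint_path)
    if state is None:
        state = {"iteration": 0,
                 "params": list(initial_params),
                 "converged": False}

    while not state["converged"] and state["iteration"] < max_iterations:
        new_params = list(step(state["params"]))
        change = max(abs(new - old)
                     for new, old in zip(new_params, state["params"]))
        state = {"iteration": state["iteration"] + 1,
                 "params": new_params,
                 "converged": change < tolerance}
        save_checkpoint(checkpoint_path, state)

    return state
```

With something like this, a job stopped at the wall time limit loses at most one iteration, and re-submitting the same script picks up from the checkpoint. Writing to disk every iteration may be too expensive for a real fit, so checkpointing every N iterations would be a reasonable variation.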