materialsproject / fireworks

The Fireworks Workflow Management Repo.
https://materialsproject.github.io/fireworks

Fix a bug where the timeout is not honored in `rlaunch multi` #419

Closed zhubonan closed 2 years ago

zhubonan commented 4 years ago

This PR fixes a bug where the timeout was not honoured in multi-launch.

The cause is that a new rapid-fire process is launched whenever other processes are still running, even when there are no active or future jobs in the FireServer. Each new rapid-fire launch should use an updated timeout based on the elapsed time, rather than the original timeout value.

To avoid very short launches, I also made the process stand down if less than 3% of the original timeout remains.
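For readers following along, the idea can be sketched roughly as below. This is only an illustration of the logic described in this comment, not the actual diff: `remaining_timeout`, `MIN_REMAINING_FRACTION`, and the parameter names are hypothetical and do not come from the FireWorks code base.

```python
import time
from typing import Optional

# Hypothetical constant: the 3% threshold mentioned in the description.
MIN_REMAINING_FRACTION = 0.03


def remaining_timeout(
    original_timeout: float,
    start_time: float,
    now: Optional[float] = None,
    min_fraction: float = MIN_REMAINING_FRACTION,
) -> Optional[float]:
    """Return the seconds left for a new launch, or None to stand down.

    original_timeout: total time budget in seconds given at start-up.
    start_time: monotonic timestamp taken when the process started.
    now: current monotonic timestamp (defaults to time.monotonic()).
    min_fraction: stand down if less than this share of the budget is left.
    """
    if now is None:
        now = time.monotonic()
    remaining = original_timeout - (now - start_time)
    # Too little time left for a useful launch: signal a stand-down.
    if remaining < min_fraction * original_timeout:
        return None
    return remaining


if __name__ == "__main__":
    # 40 s into a 100 s budget: relaunch with the reduced timeout.
    print(remaining_timeout(100.0, start_time=0.0, now=40.0))
    # 98 s into a 100 s budget: under 3% left, so stand down.
    print(remaining_timeout(100.0, start_time=0.0, now=98.0))
```

Passing the returned value, instead of the original timeout, to each subsequent rapid-fire launch keeps the total run time within the budget given at start-up.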

zhubonan commented 3 years ago

@utf Hi Alex, could you please take a look at this PR? Thanks!

utf commented 3 years ago

Looks good to me. Maybe @computron can confirm.

janosh commented 3 years ago

@computron This would be quite useful if merged, especially if the remaining time required before standing down were made into a variable.