materialsproject / fireworks

The Fireworks Workflow Management Repo.
https://materialsproject.github.io/fireworks
Other
361 stars 185 forks source link

fix regression when rerunning one firework with dependencies #309

Closed gpetretto closed 6 years ago

gpetretto commented 6 years ago

I have realized that this commit https://github.com/materialsproject/fireworks/commit/d9a71ac27cfd025af39f507137c407657e17dd38 have caused a regression. Consider the case with a fw fw1 with one child fw2 and the whole wf is completed. Trying to rerun fw1 now results in having both fw1 and fw2 READY, instead of fw2 being WAITING.

Here is a piece of code demonstrating the problem

from fireworks import LaunchPad, Firework, Workflow
from fireworks.core.rocket_launcher import rapidfire
from fireworks.examples.custom_firetasks.hello_world.hello_world_task import HelloTask

lp = LaunchPad.auto_load()
#lp.reset()  

fw1 = Firework([HelloTask()])
fw2 = Firework([HelloTask()], parents=[fw1])
wf = Workflow([fw1, fw2])
old_new = lp.add_wf(wf)

fw_ids = list(old_new.values())

print("READY: {}".format(lp.get_fw_ids({"state": "READY", "fw_id": {"$in": fw_ids}})))
print("WAITING: {}".format(lp.get_fw_ids({"state": "WAITING", "fw_id": {"$in": fw_ids}})))

rapidfire(lp)

print("COMPLETED: {}".format(lp.get_fw_ids({"state": "COMPLETED", "fw_id": {"$in": fw_ids}})))

print("rerunning")
lp.rerun_fw(fw1.fw_id)

print("READY: {}".format(lp.get_fw_ids({"state": "READY", "fw_id": {"$in": fw_ids}})))
print("WAITING: {}".format(lp.get_fw_ids({"state": "WAITING", "fw_id": {"$in": fw_ids}})))

and its output

2018-10-19 15:30:58,521 INFO Added a workflow. id_map: {-2: 1, -1: 2}
READY: [2]
WAITING: [1]
2018-10-19 15:30:58,526 INFO Created new dir 
...
2018-10-19 15:30:58,578 INFO Rocket finished
COMPLETED: [1, 2]
rerunning
READY: [1, 2]
WAITING: []

The problem relies in the fact that now the parent state is not updated during the refreshof the child and since the refresh basically happens from the bottom the child still sees his parent as COMPLETED.

It is likely that the solution implemented is not the best but it does not impact on the improvements introduced in https://github.com/materialsproject/fireworks/commit/d9a71ac27cfd025af39f507137c407657e17dd38. An alternative would be to add a flag to refresh that allows to explicitly pull the current state of the parent fws if needed, like in the old implementation.

I also modified one of the tests to help preventing this kind of regression.

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.008%) to 60.499% when pulling 8aee33bf5512dd9a59dc4cadbed2cc31807e85c9 on gpetretto:bugfix into f55c2af89bae38005cc2a6c5d2371779a737f39e on materialsproject:master.

computron commented 6 years ago

Thanks!