Closed ninjapapa closed 5 years ago
In the old "output" module paradigm, modules were not "run" in the sense of recomputing the DF; they were re-run only to publish the output. Now that "output" modules are pure publish, we should re-run them every time. As a user, that is what I would expect, since there is no way to compare the published result with the current output to determine whether we need to re-publish.
@AliTajeldin makes sense.
Put the real write operation into a "post_run" method. Later figured out that it can just go in the _post_action method, since that always runs after the run method and will always be called, even for ephemeral modules.
However, with more thought, it is not ideal:
Will do the following:
The current entry point to a module from the runner is _get_data. That entry point needs to become _do_it. Then _do_it can call _get_data for regular modules; for output modules, _do_it just calls doRun directly, and the write operation stays in doRun.
Within output's _do_it, will call _run_ancestor_and_me_postAction since the write operation guarantees an action.
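The plan above could be sketched roughly as follows. This is only an illustration under assumptions, not the project's actual code: the class names Module and OutputModule, the persisted dict, the sink dict, and the run_count counter are all hypothetical, and the _run_ancestor_and_me_postAction step is not modeled. Only the method names _do_it, _get_data, and doRun come from the discussion.

```python
class Module:
    """Hypothetical stand-in for a regular module."""

    # Hypothetical shared store: hash-of-hash -> persisted data.
    persisted = {}

    def __init__(self, name, hash_of_hash):
        self.name = name
        self.hash_of_hash = hash_of_hash
        self.run_count = 0  # illustration only: counts calls to doRun

    def doRun(self):
        self.run_count += 1
        return "data from " + self.name

    def _get_data(self):
        # Regular path: reuse persisted data when the hash-of-hash matches.
        store = Module.persisted
        if self.hash_of_hash not in store:
            store[self.hash_of_hash] = self.doRun()
        return store[self.hash_of_hash]

    def _do_it(self):
        # Proposed entry point: regular modules delegate to _get_data.
        return self._get_data()


class OutputModule(Module):
    """Hypothetical stand-in for an output (pure publish) module."""

    def __init__(self, name, hash_of_hash, sink):
        super().__init__(name, hash_of_hash)
        self.sink = sink  # a dict standing in for the output file/table

    def doRun(self):
        # The write operation stays in doRun.
        self.run_count += 1
        self.sink[self.name] = "published"

    def _do_it(self):
        # Output modules skip the persisted-data check and always publish.
        return self.doRun()
```

With this shape, calling _do_it twice on a regular module runs doRun once, while calling it twice on an output module publishes twice, so a deleted output would be recreated on the next run.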
Currently output modules re-run the same way as other modules, so if persisted data with the same hash-of-hash exists, the module will NOT rerun. However, this may not be the desired behavior: a user may have deleted the output file/table and expect that rerunning the output module will recover it.
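To make the problem concrete, here is a toy reproduction of the skip-on-matching-hash behavior. Everything in it is hypothetical (the function run_output_if_needed, the persisted set, and the sink dict standing in for the output table); it only mimics the behavior described above and is not the actual implementation.

```python
def run_output_if_needed(hash_of_hash, persisted_hashes, sink):
    """Return True if the output was written, False if the run was skipped."""
    if hash_of_hash in persisted_hashes:
        # Same hash-of-hash already persisted: the module does NOT rerun.
        return False
    sink["table"] = "published"
    persisted_hashes.add(hash_of_hash)
    return True


persisted = set()
sink = {}

first = run_output_if_needed("abc", persisted, sink)   # writes the output
del sink["table"]                                      # user deletes the output
second = run_output_if_needed("abc", persisted, sink)  # skipped, nothing rewritten
```

After the second call the output is still missing, which is the surprise described above.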