I am trying out a subset simulation workflow and finding it difficult to test locally because the jobscript-level element loop is slow. To alleviate this, we should support (for the specific case of actions that simply run Python scripts) running the script for multiple elements within the same invocation.
When an action script is written (with script_data_in: direct and script_data_out: direct), boilerplate code like this is added to the script file that is generated within the run directory (sample_direct_MC.py is the name of the script in this particular case):
def sample_direct_MC(...):
    ...
    return ...

if __name__ == "__main__":
    # ...
    # ... parse CLI arguments, import app, set config, etc.
    # ...
    wk_path, EAR_ID = args.wk_path, args.run_id
    wk = app.Workflow(wk_path)
    EAR = wk.get_EARs_from_IDs([EAR_ID])[0]
    direct_ins = EAR.get_input_values_direct()
    outputs = sample_direct_MC(**direct_ins)
    outputs = {"outputs." + k: v for k, v in outputs.items()}
    for name_i, out_i in outputs.items():
        wk.set_parameter_value(param_id=EAR.data_idx[name_i], value=out_i)
To support this new feature, we would instead write something like:
def sample_direct_MC(...):
    ...
    return ...

if __name__ == "__main__":
    # ...
    # ... parse CLI arguments, import app, set config, etc.
    # ...
    wk_path, EAR_IDs = args.wk_path, args.run_ids  # note: plural
    wk = app.Workflow(wk_path)
    runs = wk.get_EARs_from_IDs(EAR_IDs)  # keep the full list (no [0]) so we can loop over it
    for run_i in runs:
        direct_ins = run_i.get_input_values_direct()
        outputs = sample_direct_MC(**direct_ins)
        outputs = {"outputs." + k: v for k, v in outputs.items()}
        for name_i, out_i in outputs.items():
            wk.set_parameter_value(param_id=run_i.data_idx[name_i], value=out_i)
i.e. we add a loop over runs to the boilerplate, and call the script's main function once per run within this loop.
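For the plural args.run_ids above, the CLI parsing in the boilerplate would need to accept more than one run ID. The following is only a minimal sketch of how that might look with argparse; the flag names --wk-path and --run-ids and the parse_args helper are assumptions for illustration, not the existing interface, and the path is a placeholder.

```python
import argparse


def parse_args(argv=None):
    """Parse the workflow path and one or more run IDs (hypothetical CLI)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--wk-path", required=True)
    # nargs="+" collects one or more integers into a list (args.run_ids)
    parser.add_argument("--run-ids", type=int, nargs="+", required=True)
    return parser.parse_args(argv)


# Example invocation with three run IDs (placeholder path and IDs)
args = parse_args(["--wk-path", "/path/to/workflow", "--run-ids", "3", "4", "5"])
wk_path, EAR_IDs = args.wk_path, args.run_ids
print(wk_path, EAR_IDs)
```

Passing a single ID still yields a one-element list, so the loop over runs would cover the existing one-run-per-invocation case without a separate code path.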
Possible future work after initial implementation:
- add support for different script_data_in/out formats (implement direct only initially)
- use Python multiprocessing to optionally add some concurrency
- add similar support for other languages (e.g. Julia)
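As a rough illustration of the concurrency item, one option might be to compute the outputs for each run in worker processes and write them back to the workflow from the parent process only. The sketch below uses concurrent.futures.ProcessPoolExecutor (which builds on multiprocessing) rather than multiprocessing directly. The sample_direct_MC signature, the _call_main and run_all helpers, and the input values are all hypothetical stand-ins; no workflow object is involved.

```python
from concurrent.futures import ProcessPoolExecutor


def sample_direct_MC(dimension, num_samples):
    """Stand-in for the script's main function (hypothetical signature)."""
    return {"total": dimension * num_samples}


def _call_main(direct_ins):
    # Module-level helper so it can be pickled and sent to worker processes
    return sample_direct_MC(**direct_ins)


def run_all(inputs_by_run, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Compute outputs for every run concurrently; return {run_id: outputs}."""
    run_ids = list(inputs_by_run)
    with executor_cls(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs it was given
        results = list(pool.map(_call_main, (inputs_by_run[i] for i in run_ids)))
    return dict(zip(run_ids, results))


if __name__ == "__main__":
    # Placeholder inputs keyed by run ID, standing in for get_input_values_direct()
    inputs_by_run = {
        7: {"dimension": 2, "num_samples": 10},
        8: {"dimension": 3, "num_samples": 10},
    }
    for run_id, outputs in run_all(inputs_by_run, max_workers=2).items():
        # In the real boilerplate, the parent process would save outputs here
        print(run_id, outputs)
```

Keeping the writes in the parent process is a design assumption: it avoids several processes writing to the workflow at once, at the cost of sending inputs and outputs between processes.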