Open eirrgang opened 5 years ago
We have resolved to distinguish between a graph edge that is an ensemble or array of operations and data that is a sequence or array.
We can convert between the two (if necessary) with map and gather, borrowing common meanings of such terms. Helper functions can automatically broadcast data when input data types are known, but we can also use implicitly generated broadcast operations and allow for explicit broadcast helper functions. reduce will also fall into this set of data flow operations when it is explicitly represented at a higher level.
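To make the vocabulary concrete, here is a minimal pure-Python sketch of the four data flow terms. The function names (broadcast, map_members, gather, reduce_members) are hypothetical stand-ins for illustration and are not part of the gmx package; plain lists stand in for ensemble edges.

```python
from functools import reduce as fold
from typing import Any, Callable, List, Sequence


def broadcast(value: Any, width: int) -> List[Any]:
    """Repeat one value so that each ensemble member receives its own copy."""
    return [value] * width


def map_members(operation: Callable[[Any], Any], data: Sequence[Any]) -> List[Any]:
    """Apply an operation once per element: array data becomes an array of operations."""
    return [operation(item) for item in data]


def gather(member_results: Sequence[Any]) -> List[Any]:
    """Collect per-member results back into a single array-valued datum."""
    return list(member_results)


def reduce_members(combine: Callable[[Any, Any], Any],
                   member_results: Sequence[Any],
                   initial: Any) -> Any:
    """Fold per-member results into one value."""
    return fold(combine, member_results, initial)


# Hypothetical usage: three members all receive the same file name.
inputs = broadcast('input.tpr', 3)
lengths = gather(map_members(len, inputs))
total = reduce_members(lambda a, b: a + b, lengths, 0)
```

In this sketch map_members is the only step that changes "one datum that is an array" into "many operations", and gather is its inverse; broadcast and reduce_members only reshape data.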
Further along in #190, we would expect to have something like hbond = gmx.tool.hbond(...) (or gmxtool.hbond, gmx.tool('hbond')() or something), but in the simplest first-round case, we wrap the command line, where hbond is the first argument to the command with the gmx executable.
In the simplest case:

    hbond = gmx.commandline_operation('gmx',
                                      arguments=['hbond'],
                                      input={
                                          '-f': 'somefile.trr',
                                          '-s': 'input.tpr',
                                          '-n': 'index.ndx'
                                      },
                                      output={
                                          '-num': 'bynum.xvg',
                                          '-ang': 'hbang.xvg'
                                      })
For change #200, I expect to use the same implicit scatter or map idea as previously with from_tpr(): an array or list value implicitly generates an array operation; to get the effect of broadcast, a list of identical items is used. This will be refined in #203.
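As a rough illustration of the implicit map idea, the sketch below expands one dictionary of inputs into one dictionary per ensemble member. expand_ensemble is a hypothetical helper, not an existing gmx function. Repeating scalar values for every member is an assumption made here for convenience; the behavior described above only relies on list values, with broadcast spelled as a list of identical items.

```python
from typing import Any, Dict, List


def expand_ensemble(inputs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one input dictionary per ensemble member.

    A list value implies an array operation with one member per element.
    Assumption: scalar values are repeated for every member.
    """
    widths = {len(value) for value in inputs.values() if isinstance(value, list)}
    if len(widths) > 1:
        raise ValueError('All list-valued inputs must have the same length.')
    width = widths.pop() if widths else 1
    members = []
    for i in range(width):
        # Pick element i from each list; pass scalars through unchanged.
        members.append({
            key: (value[i] if isinstance(value, list) else value)
            for key, value in inputs.items()
        })
    return members


# Hypothetical usage: two trajectories, with the same run input file
# "broadcast" by listing it twice.
members = expand_ensemble({
    '-f': ['a.trr', 'b.trr'],
    '-s': ['input.tpr', 'input.tpr'],
    '-n': 'index.ndx',
})
```

Here members holds two dictionaries that differ only in their '-f' entry, which is the shape an array of hbond operations would consume.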
It seems inelegant that filename options could be placed in either keyword_arguments or input/output, and the idea of intelligently handling non-scalar values seems like unnecessary complexity. For this issue, #200 should remove keyword_arguments. Users can manually append elements to arguments to the same effect.
For the above examples to work, we should specify that arguments are added to the command line immediately after the executable.
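The ordering rule can be sketched as follows. build_command and its parameter names are hypothetical and only illustrate the proposed ordering (executable, then arguments, then input flags, then output flags); this is not the actual commandline_operation implementation. It relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 (otherwise use collections.OrderedDict).

```python
from typing import Dict, List, Optional, Sequence


def build_command(executable: str,
                  arguments: Sequence[str] = (),
                  input_files: Optional[Dict[str, str]] = None,
                  output_files: Optional[Dict[str, str]] = None) -> List[str]:
    """Assemble an argument vector suitable for subprocess.run()."""
    command = [executable]
    # Positional arguments come immediately after the executable.
    command.extend(arguments)
    # File flags follow as flag/filename pairs, inputs before outputs.
    for flags in (input_files or {}, output_files or {}):
        for flag, filename in flags.items():
            command.extend([flag, filename])
    return command


# Hypothetical usage mirroring the hbond example above.
command = build_command(
    'gmx',
    arguments=['hbond'],
    input_files={'-f': 'somefile.trr', '-s': 'input.tpr', '-n': 'index.ndx'},
    output_files={'-num': 'bynum.xvg', '-ang': 'hbang.xvg'},
)
```

With this ordering, an option that used to go in keyword_arguments can simply be appended to arguments and still lands before the file flags.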
subtask of #190
gmx.command_line() produces gmx.Operation objects that can be used in a work graph to invoke subprocesses. gmx.map() generates appropriate graph topologies for e.g. ArrayOperation or ensemble simulation inputs. The user expresses CLI flags as a dictionary (collections.OrderedDict) of key-value pairs. The user must express execution order with the usual work graph dependency annotation.