hpcflow / matflow-new

Mozilla Public License 2.0

Write MatFlow 'wiring' guide for contributors #222

Open gcapes opened 10 months ago

gcapes commented 10 months ago

It wasn't clear to me how data is passed around MatFlow, and which schema inputs translate to which action script/function inputs. A brief overview would probably help contributors/users to implement a new schema.

gcapes commented 10 months ago

I've made a start on this below, but I'm sure @aplowman you can expand and clarify some of this. It would probably be good to have an explanation for each of the keywords that can be used in a task schema.

Task schema inputs

Note that parameters listed as inputs in a task schema are the inputs required to complete that 'step', i.e. the objective, and do not necessarily map exactly onto the inputs passed to the script called in the actions. How the inputs are passed is controlled by the value of script_data_in. A value of direct passes the variables as you would expect: if you define e.g. a: 23 in your workflow.yaml file, the script will receive a value of 23 for the parameter a.
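To make the direct case concrete, here is a minimal sketch in plain Python. It does not call MatFlow at all: the function name my_script, the output name b, and returning outputs as a dict are assumptions for illustration only. It just mimics a value defined as a: 23 arriving as the parameter a.

```python
def my_script(a):
    """Hypothetical action script function: receives the schema input `a` by value."""
    # The output name `b` and the dict return are assumptions, not confirmed MatFlow behaviour.
    return {"b": a * 2}


# Stand-in for the inputs defined in workflow.yaml (a: 23).
workflow_inputs = {"a": 23}

# With a direct hand-off, the script sees the value itself, not a file path.
result = my_script(**workflow_inputs)
```

The point is only that the parameter name in the script matches the input name in the workflow file, and the value arrives unchanged.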

Sometimes you might want to save some or all of the schema input variables in a file, and pass that file as an input argument to your script. e.g. sample texture from ctf file: the input parameters to the script are a JSON file and an HDF5 path, but the schema has multiple inputs, which are saved into the JSON file. This file is named automatically by MatFlow and looks something like js_0_act_0_inputs.json. MatFlow then detects it as the input to the script when you use the variable inputs_JSON_path.
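The file-based hand-off can be sketched in plain Python as follows. This is not MatFlow code: write_inputs_file is a hypothetical stand-in for the step MatFlow performs itself, the input names num_samples and phase are made up, and the HDF5 path argument mentioned above is left out to keep the sketch short. Only the file name pattern and the variable name inputs_JSON_path come from the description above.

```python
import json
import tempfile
from pathlib import Path


def write_inputs_file(directory, schema_inputs, name="js_0_act_0_inputs.json"):
    """Hypothetical stand-in for MatFlow's step: dump the schema inputs to a JSON file."""
    path = Path(directory) / name
    path.write_text(json.dumps(schema_inputs))
    return path


def sample_texture(inputs_JSON_path):
    """Hypothetical script function: it is given a file path, not the input values."""
    with open(inputs_JSON_path) as handle:
        inputs = json.load(handle)
    return inputs


with tempfile.TemporaryDirectory() as tmp_dir:
    # Made-up schema inputs; several inputs end up in a single file.
    schema_inputs = {"num_samples": 100, "phase": "alpha"}
    json_path = write_inputs_file(tmp_dir, schema_inputs)
    loaded = sample_texture(json_path)
```

Compared with direct, the script has one path argument however many inputs the schema lists, and it is responsible for reading the values back out of the file.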

A similar process exists for outputs (script_data_out), and for other file formats.