pyiron / ironflow

Prototype of a graphical user interface for pyiron (unstable)
https://mybinder.org/v2/gh/pyiron/ironflow/HEAD?labpath=example.ipynb
BSD 3-Clause "New" or "Revised" License

Ironflow

Ironflow combines ryven, ipycanvas and ipywidgets to provide a Jupyter-based visual scripting gui for running pyiron workflow graphs. This project is under active development, and in particular the set of nodes available for the workflow graphs is still limited. If there is a particular use-case you'd like to see, or if one of our nodes is not working as expected, please raise an issue!

In its current form, ironflow has some UI performance issues when verifying the ontological status of ports. For smaller graphs, e.g. those in the examples, things should still feel quite snappy (if you notice serious performance issues there, please raise an issue!). For larger graphs, e.g. many tens of nodes, you may notice some delay when generating the "recommended" nodes and ports on selection of an ontologically-typed port, and when updating the otype status on making new connections. This is a known issue, but not a top priority to fix; if you are using ironflow regularly and bumping into this problem a lot, please let us know in an issue and we'll increase its priority.

Usage

The main gui can be imported directly from ironflow.

The gui takes a session title at instantiation, and will automatically try to load any saved session (a JSON file) with the same name present. To visualize the gui, call the draw method. E.g.:

from ironflow import GUI
gui = GUI('example')
gui.draw()

The main screen for ironflow is used to build/run/save/load graphical pyiron workflows. In addition to manipulating the gui with buttons in the toolbar (hover the cursor over buttons for more info), you can:

In the default data execution mode (we don't currently do anything with the exec mode, so don't worry about it), nodes will update their output whenever their input data changes. You'll see the node body change color when it's performing this update. Some nodes have input (or output) ports that are of the execution rather than data type. These can be triggered by a signal from another node's exec-type output port, or by manually clicking the button associated with that port right there in the node widget.

In addition to the workflows screen, ironflow also incorporates the browser from pyiron_gui, as well as a log tab that allows you to turn the underlying ryven logger on/off and choose whether stdout gets routed to ironflow or its original context.

Two notes on the logger:

Differences to Ryven

Ironflow is built on top of ryvencore 0.3.1.1. There are a number of minor differences between ryven nodes and ironflow nodes discussed in the next section, but at a high level there are two significant differences:

Data typing

All node ports are typed, and connections perform type-checking to ensure validity before a connection is established. By default, a special Untyped data type is used, which performs all validity checks by value, and thus does not allow pre-wiring of a graph without full data. Further, the validity of the current value for each IO port is indicated by the port color: green for valid, red for invalid.

You can read the full spec for the typing rules in the ironflow.model.dtypes module, but at a high level each port has one or more classes whose instances are valid input. An output port can be connected to an input port as long as its valid classes are a strict subset of the input port's valid classes, and as long as the output port won't allow the input port to be surprised by a None value.
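The connection rule above can be sketched in a few lines. This is only an illustration, not the code in ironflow.model.dtypes: the PortType class, its allow_none flag, and the can_connect function are hypothetical names, and the sketch reads "subset" as allowing equal sets (so a float output can feed a float input), which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortType:
    """Hypothetical stand-in for a port's data type (not ironflow's dtype classes)."""
    valid_classes: frozenset  # classes whose instances are valid for this port
    allow_none: bool = False  # whether the port may hold None


def can_connect(output: PortType, inp: PortType) -> bool:
    """Return True if the output port may be wired to the input port."""
    # Everything the output can emit must be acceptable to the input.
    classes_ok = output.valid_classes <= inp.valid_classes
    # The input must not be surprised by None: either it accepts None,
    # or the output never produces it.
    none_ok = inp.allow_none or not output.allow_none
    return classes_ok and none_ok


int_out = PortType(frozenset({int}))
number_in = PortType(frozenset({int, float}))
maybe_int_out = PortType(frozenset({int}), allow_none=True)

print(can_connect(int_out, number_in))        # True
print(can_connect(maybe_int_out, number_in))  # False
```

The second call fails only because of the None rule; the class check alone would pass.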

This type checking is still under development and may be somewhat brittle. Our goal is to extend this system so that it is dynamically informed by an ontology on top of the graph: instead of statically insisting that an input be of type float, we would dynamically demand that the ontological type of an energy be "surface energy" because the output value of that port is used, e.g., to calculate a grain boundary interface energy.

Ontological typing

Nodes can also optionally carry an "ontological type" (otype). Leaning on the pyiron_ontology library for representing knowledge in computational workflows, otypes give a rich, graph-dependent representation of the data and facilitate guided workflow design. This is fully demonstrated in the bulk_modulus.ipynb and surface_energy.ipynb notebooks, but a quick demo is also provided in the video below.

We see that there is a "recommended" tab for nodes. After selecting this menu, clicking on the CalcMurnaghan.engine port populates the tab with nodes that have valid output for this port. We can double-click to place the new node (Lammps) and repeat the process, e.g. for the Lammps.structure input. Here we see there are two possibilities -- BulkStructure and SlabStructure -- and place both. (Note, as mentioned at the head of the readme, there is some lag in ironflow right now; you can see this in the delay between the double-click and the placement of these larger nodes.)

Not only do we get recommendations for nodes to place in the graph, but we also get specific recommendations of which ports make valid connections! Below we again select the Lammps.structure input port, and see that the output ports on both structure nodes are highlighted. Similarly, if we click the Lammps.engine output port, we see that all the valid input ports on our graph get highlighted; in this case, CalcMurnaghan.input.

Finally, we see the real power of otypes -- by connecting the two engine ports, the Lammps node now has access to the ontological requirements of the CalcMurnaghan node! In particular, CalcMurnaghan produces bulk moduli and thus only works for calculations on bulk structures. After these are connected, when we once again select the Lammps.structure input, only the BulkStructure node gets highlighted, and only BulkStructure appears in the recommended nodes window.

ironflow_ontology.mov

Of course, not all ports in ironflow are otyped, and indeed not all should be -- e.g. it doesn't make sense to ontologically-type the output of the Linspace node, as it is just providing numbers which may be useful in many contexts. However, for nodes which specifically produce and require physically-/ontologically-meaningful data, otyping is a powerful tool for understanding workflows and guiding their design.

Batching

Many ports can be "batched" by selecting them to open the node controller window and pressing the "batched" button. This changes the expected input for the port from a single value to a list of values. The node operation is then iterated over the entire list, and output values are correspondingly also turned to a list.

You can quickly see which ports are batched in the graph because their labels are converted to ALL_CAPS while unbatched ports are all_lower_case.

Any number of input ports can be batched on the same node as long as all batches are of the same length.
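As a rough picture of what batching does, the sketch below iterates a node operation over its batched inputs and collects the outputs into lists. It is not ironflow's implementation: run_batched and scale are hypothetical names, and the sketch assumes the node operation takes keyword arguments named after the input labels and returns a dictionary keyed by output label.

```python
def run_batched(node_function, inputs, batched):
    """Call node_function once per batch element and collect outputs as lists.

    inputs:  dict mapping input label -> value (a list for batched ports)
    batched: set of labels of the input ports that are batched
    """
    if not batched:
        # Nothing is batched: a single plain call.
        return node_function(**inputs)

    # All batched ports must supply lists of the same length.
    lengths = {len(inputs[label]) for label in batched}
    if len(lengths) != 1:
        raise ValueError("All batched inputs must have the same length")
    (n,) = lengths

    results = []
    for i in range(n):
        # Take the i-th element of batched ports, and the whole value otherwise.
        kwargs = {
            label: (value[i] if label in batched else value)
            for label, value in inputs.items()
        }
        results.append(node_function(**kwargs))

    # Convert the list of output dicts into a dict of output lists.
    keys = results[0].keys() if results else []
    return {key: [result[key] for result in results] for key in keys}


def scale(x, factor):
    """Toy node operation with inputs 'x' and 'factor' and output 'scaled'."""
    return {"scaled": x * factor}


print(run_batched(scale, {"x": [1, 2, 3], "factor": 10}, batched={"x"}))
# {'scaled': [10, 20, 30]}
```

Here only x is batched, so factor is reused unchanged for every element of the batch.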

Batching impacts the type checking in a (hopefully) intuitive way: a batched output port of type float can be fed to a batched input port of type float but not to an unbatched input port of type float. Similarly, an unbatched port of type list[float] can be passed to an input port of type float only if that port is batched. Only single values and 1D lists are supported right now, although support for higher order matrices of data is planned.
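One way to think about these rules is that batching adds one list level to whatever the port's declared type already carries, and a connection needs the same element type and the same number of list levels on both ends. The sketch below encodes that reading. It is an interpretation, not ironflow's actual check: Port, list_levels and batch_compatible are hypothetical names, and the sketch also accepts a batched float output feeding an unbatched list[float] input, a case the text above does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical port description used only for this illustration."""
    element_type: type     # type of a single value, e.g. float
    is_list: bool = False  # True if the declared type is list[element_type]
    batched: bool = False  # True if the "batched" button is active


def list_levels(port: Port) -> int:
    """0 for a single value, 1 for a 1D list (declared as a list or batched)."""
    return int(port.is_list) + int(port.batched)


def batch_compatible(output: Port, inp: Port) -> bool:
    """Same element type and the same number of list levels on both ends."""
    return (
        output.element_type is inp.element_type
        and list_levels(output) == list_levels(inp)
    )


batched_float = Port(float, batched=True)
plain_float = Port(float)
float_list = Port(float, is_list=True)

print(batch_compatible(batched_float, batched_float))  # True
print(batch_compatible(batched_float, plain_float))    # False
print(batch_compatible(float_list, batched_float))     # True
print(batch_compatible(float_list, plain_float))       # False
```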

Adding custom nodes

The tools needed for extending your graphs with new custom nodes can be imported as from ironflow import node_tools. New nodes can be registered either from a list of nodes, or from a python module or .py file. In the latter two cases, only those nodes that inherit from Node and have a class name ending in _Node will be registered (this allows you to have your own node class templates and avoid loading the template itself by simply using regular python CamelCase naming conventions and avoiding ending in _Node).
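The filtering rule for modules and .py files can be illustrated as follows. This is a self-contained sketch of the rule as described, not ironflow's registration code: the Node class here is a stand-in for the real base class, and find_registrable_nodes, MyTemplate, Adder_Node and Unrelated_Node are hypothetical names.

```python
import inspect
import types


class Node:
    """Stand-in for ironflow's Node base class so the sketch runs on its own."""


def find_registrable_nodes(module):
    """Return classes in `module` that inherit from Node and end in '_Node'."""
    return [
        cls
        for _, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, Node) and cls.__name__.endswith("_Node")
    ]


class MyTemplate(Node):
    """A user template: inherits from Node, but the CamelCase name is skipped."""


class Adder_Node(MyTemplate):
    """Inherits from Node (via the template) and ends in _Node: registered."""


class Unrelated_Node:
    """Ends in _Node but does not inherit from Node: skipped."""


# Build a throwaway module holding the three classes.
demo = types.ModuleType("demo_nodes")
demo.MyTemplate = MyTemplate
demo.Adder_Node = Adder_Node
demo.Unrelated_Node = Unrelated_Node

print([cls.__name__ for cls in find_registrable_nodes(demo)])
# ['Adder_Node']
```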

A new node should have a title and may optionally have input and/or output channels specified. If you want your node to actually do something, you'll also need to define an update_event method. E.g.:

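from ironflow.node_tools import Node, NodeInputBP, NodeOutputBP, dtypes, input_widgets

# The class name ends in _Node so that it is picked up when registering from a module or file
class My_Node(Node):
    title = "MyUserNode"
    # One integer-typed input port labelled "foo", defaulting to 1
    init_inputs = [
        NodeInputBP(dtype=dtypes.Integer(default=1), label="foo")
    ]
    # One output port labelled "bar"
    init_outputs = [
        NodeOutputBP(label="bar")
    ]
    color = 'cyan'

    def update_event(self, inp=-1):
        # Read input port 0, add 42, and write the result to output port 0
        self.set_output_val(0, self.input(0) + 42)

# gui is the GUI instance created in the Usage section
gui.register_node(My_Node)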

Ironflow nodes differ from standard ryven (version 0.3.1.1) nodes in five ways:

Otherwise, they are just standard ryven nodes, and all the ryven documentation applies.

Special nodes

We also have a number of special parent node classes available, based on the meta-parent BatchingNode. Instead of specifying the update_event, children of BatchingNode specify other functions so that the update can be automatically batched over.

The simplest of these is DataNode, for which children specify the node_function method, which must take arguments based on the labels of input ports and return a dictionary with keys based on the labels of output ports. Nodes of this type attempt to update themselves on placement, and will automatically update or clear (set to None) their output ports based on whether or not all of their input ports report valid input values.
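The node_function contract can be pictured with a small stand-in class. This does not use or reproduce ironflow's DataNode: SketchDataNode, Multiply, and the input_labels/output_labels attributes are hypothetical, and "valid input" is simplified here to "not None", whereas the real nodes rely on the type checks described earlier.

```python
class SketchDataNode:
    """Stand-in illustrating the node_function contract (not ironflow's DataNode)."""

    input_labels = ()
    output_labels = ()

    def __init__(self):
        self.inputs = {label: None for label in self.input_labels}
        self.outputs = {label: None for label in self.output_labels}

    def node_function(self, **kwargs):
        """Children override this: keyword args by input label, dict by output label."""
        raise NotImplementedError

    def update(self):
        # Simplifying assumption: an input is "valid" when it is not None.
        if all(value is not None for value in self.inputs.values()):
            self.outputs.update(self.node_function(**self.inputs))
        else:
            # Clear every output when any input is invalid.
            self.outputs = {label: None for label in self.output_labels}


class Multiply(SketchDataNode):
    input_labels = ("x", "y")
    output_labels = ("product",)

    def node_function(self, x, y):
        return {"product": x * y}


node = Multiply()
node.update()
print(node.outputs)  # {'product': None}

node.inputs.update(x=3, y=4)
node.update()
print(node.outputs)  # {'product': 12}

node.inputs["y"] = None
node.update()
print(node.outputs)  # {'product': None}
```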

The others are TakesJob and MakesJob, children of which must specify _modify_job or _generate_job methods, respectively. These nodes are designed to interact with pyiron's GenericJob objects in a functional way. They also support batching, and will automatically populate run and remove buttons on the node widget, and lock the input after their owned job(s) are run.

Structure

The code is broken into three main submodules:

The node_tools submodule is just a wrapper to expose other parts of the code base in one easy-to-import-from spot.

The model itself, HasSession, is just a driver for a single ryven Session, with some helpful tools like the ability to easily register new nodes.

The gui inherits from and drives the model, and is broken down into three screens: workflows (which allows you to manipulate the model), browser (which wraps the project browser from pyiron_gui), and a log. Inside the workflows screen, visual elements of the gui are broken down into subcomponents like the toolbar, a panel with a visual representation of the graph, a place to show the node representations, etc. We avoid listing them all here because what's included and how it's laid out is still in flux. The key conceptual bit is that these various sub-components do not rely directly on each other's internal implementation; instead, they go through the workflow screen as an intermediary where necessary.