pyiron / pyiron_workflow

Graph-and-node based workflows
BSD 3-Clause "New" or "Revised" License

Next steps #68

Open · JNmpi opened 10 months ago

JNmpi commented 10 months ago

At our workflow hackathons at the ADIS meeting, we had some interesting discussions, including a wish list from the user perspective. I am also thinking a lot about the features we need in the pyiron_workflow module to realize a syntax and functionality close to that of pyiron_base and pyiron_atomistics, while keeping all the nice architectural features we get from the node-based approach. Below is a likely rather incomplete list of the features, modules, etc. that are in my opinion crucial. It would be helpful if we could extend and prioritize this list, and if volunteers took on the specification and development of these features.

As an overarching guiding principle, we should test and benchmark the required functionality against pyiron_base/pyiron_atomistics and ironflow. The new pyiron_workflow class should provide an easy way to realize the same functionality. As an example, see the https://github.com/pyiron/pyiron_workflow/tree/JNmpi_lammps_nodes branch, where I tried to construct lammps-based workflows that closely mimic our pyiron syntax (see e.g. the jupyter notebook pyiron_like_workflows.ipynb in that branch).

### Tasks
liamhuber commented 10 months ago

> Use ironflow's graphical user interface and connect it to the workflow model. @samwaseda mentioned that he would be interested in looking into this development.

Super! I guess this will be a test of how well structured the ironflow code is 😅 I am sure we can adapt it to move the backend from ryven to pyiron_workflow while keeping (most of) the front end, but time will tell how easy that is. I'm happy to also be involved in this.

liamhuber commented 10 months ago

> Provide functionality present in ironflow: e.g. batching (ideally in connection with executors) and showing how to access and process node data

So this largely already exists in pyiron_workflow in the for-loop meta node! It even makes the looped/batched IO ALL_CAPS like ironflow does.
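
For concreteness, here is a minimal sketch of what driving the for-loop meta node might look like. The factory path (`Workflow.create.meta.for_loop`), the decorator name, the `iterate_on` parameter, and the channel labels are assumptions based on the description above, not a verbatim API:

```python
from pyiron_workflow import Workflow

# A toy node to batch over (decorator name assumed; versions may differ)
@Workflow.wrap_as.function_node("y")
def AddOne(x: int = 0) -> int:
    return x + 1

wf = Workflow("batching_demo")
# Assumed factory: builds a macro that replicates AddOne `length` times and
# exposes the looped channels in ALL_CAPS (X in, Y out), as ironflow does
ForAddOne = Workflow.create.meta.for_loop(AddOne, length=3, iterate_on=("x",))
wf.loop = ForAddOne()
wf.loop.inputs.X = [1, 2, 3]
print(wf.loop().Y)  # expected batched output: [2, 3, 4]
```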

On the GUI side I am OK with having a "batch" button that replaces a node with a for-loop wrapper of that node, and we can absolutely think about some nice syntactic sugar to make the transformation easier in the code-based interface as well. However, ironflow actually embeds the batching inside the generic node, and I really can't recommend this anymore; it creates unnecessary complication, had to be done one dimension at a time, and made the type checking hell. So under the hood I'd like to maintain the current paradigm of the for-loop meta node creating a bunch of node replicas.

In this context we will want to think about some syntactic shortcuts for the for-loop to apply an executor to each of its children instead of to itself, but that should be easy.

liamhuber commented 10 months ago

> Provide drawing tools to show the provenance of an executed workflow, i.e., similar to the graphs produced by AiiDA

I guess this is closely related to the `node_identifier` used in the `Creator` being something more sophisticated than a string for the module import. The current infrastructure was built with such an extension in mind, even though what's actually used is still very simple. We should pull @srmnitc in on this task.
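
Purely as an illustration of "something more sophisticated than a string", a structured identifier could carry enough metadata for provenance graphs while staying backwards compatible; every name here is hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeIdentifier:
    """Hypothetical richer replacement for the module-import string."""
    module: str    # e.g. "pyiron_workflow.node_library.lammps" (illustrative)
    qualname: str  # class/function name within that module
    version: str   # package or node version, for reproducible provenance

    def as_import_string(self) -> str:
        # Backwards-compatible view matching the current simple behaviour
        return f"{self.module}.{self.qualname}"
```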

liamhuber commented 10 months ago

> Provide easy access to executors and efficient and automatic distribution of the node-generated tasks

Accessing a standard Python single-core parallel process is currently as easy as `node.executor = True`. These can't be nested.
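
A minimal sketch of that one-liner in context; the `node.executor = True` shorthand is quoted from above, while the decorator name and run call are assumptions about the surrounding API:

```python
from time import sleep

from pyiron_workflow import Workflow

# A deliberately slow node, so pushing it off-process is worthwhile
# (decorator name assumed)
@Workflow.wrap_as.function_node("y")
def Slow(x: float = 1.0) -> float:
    sleep(1)
    return 2 * x

wf = Workflow("executor_demo")
wf.slow = Slow(x=21.0)
wf.slow.executor = True  # the shorthand quoted above: a standard single-core
                         # Python process executor for just this node
print(wf())  # the node now runs in its own process; executors can't be nested
```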

Adding syntactic sugar to distribute executors to children, e.g. `wf.nodes.executor = SOMETHING` setting it for all children or what have you, is an OK idea and should be easy.
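
Just to make the proposal concrete, the sugar might look something like this; everything here is hypothetical, since `wf.nodes.executor` does not exist yet:

```python
# Proposed broadcast assignment (hypothetical, does not exist yet):
wf.nodes.executor = True

# ...which would be roughly equivalent to the explicit loop below
# (the container attribute and its iteration interface are likewise assumptions):
for child in wf.nodes.values():
    child.executor = True
```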

Looping in more sophisticated executors is critical -- pympipool is still pending this PR.

We should also pull in @pmrv with the idea of setting a `tinybase.Submitter` instead of an executor, so that multiple different nodes can pile onto the same remote resources.