Tara-Lakshmipathy opened 4 days ago
I think some aspect of this should be handled inside pyiron_workflow
and just linked here. E.g., I could imagine something like this belonging nicely in the backend module:
```python
def to_macro(*nodes: StaticNode, class_name: str | None = None) -> type[Macro]:
    """
    Given a set of sibling node instances, creates a new macro class where all
    connections internal to the passed nodes are reformed in the graph creation
    method, and all current values for external inputs (i.e. those which do not
    have a connection to a sibling in `*nodes`) are used as defaults for the new
    macro IO.

    A class name can be provided explicitly, but otherwise will be constructed
    using data from `*nodes`.

    TODO: Maybe expose a kwarg for modifying the IO interface?
    """
```
That is pretty rough; I could alternatively imagine returning a string of the code defining the graph-creation method (or both the class and the raw code). It is also incomplete: what you really want to be able to do is this *and* replace `*nodes` in the graph with an instance of the new macro on-the-fly. (Then there is the corollary step of taking a macro instance and on-the-fly replacing it with all its individual parts, i.e. "unpacking" the macro onto its parent level.)
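To make the idea above concrete, here is a rough sketch of what such a `to_macro` factory could do. The `Node` class and `connection` helper below are simplified stand-ins for illustration only, not the actual `pyiron_workflow` API, and the `label__channel` naming for macro IO is just one hypothetical convention:

```python
from __future__ import annotations


class Node:
    """Toy stand-in for a node: labelled, with inputs that are either
    plain values or connections to other nodes."""

    def __init__(self, label: str, **inputs):
        self.label = label
        self.inputs = inputs  # plain value or ("connection", other_node)


def connection(node: Node):
    """Mark an input as connected to another node's output."""
    return ("connection", node)


def to_macro(*nodes: Node, class_name: str | None = None) -> type:
    """Sketch: connections internal to ``nodes`` would be reformed inside
    the macro, while current values of external inputs become defaults
    of the new macro IO."""
    siblings = {id(n) for n in nodes}
    defaults = {}
    for n in nodes:
        for name, value in n.inputs.items():
            is_internal = (
                isinstance(value, tuple)
                and value[0] == "connection"
                and id(value[1]) in siblings
            )
            if not is_internal:
                # External input: its current value becomes a macro default
                defaults[f"{n.label}__{name}"] = value
    name = class_name or "Macro_" + "_".join(n.label for n in nodes)
    # Dynamically build the new macro class
    return type(name, (), {"input_defaults": defaults, "nodes": nodes})
```

With two toy nodes `a = Node("a", x=1)` and `b = Node("b", y=connection(a), z=2)`, `to_macro(a, b)` would expose `a__x` and `b__z` as macro inputs with defaults `1` and `2`, while `b.y` stays internal.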
Anyway, my main point is that it seems fair to make the backend responsible for most of this functionality.
That's definitely an idea I can get behind! Then from the GUI side, we would just have to pass the selected nodes to `def_macro`.
By the way @liamhuber , is there a way for me to access information about other nodes from within a node before the workflow has run even once? For example, I want the label of a node that is connected to the current node to be used from within the current node.
Something like:
```python
@as_function_node("something")
def CurrentNode(input1):
    if input1.original_owner.label == 'InterestingNode':
        variable1 = interesting_function()
    else:
        variable1 = boring
    return variable1
```
I mean, of course this is possible if the user sets up the output in the previous node to return something which includes the `original_owner` properties. But my question is: would it be a consideration for future development to make this a default?
Or maybe something even more general and not restricted to the nodes that are connected? This was the angle I was going for when I mentioned the master data class thing in the specs.
You could define helper methods on nodes that are aware of their graph state (e.g. the `wf.get_master_dataclass()`), that's not a problem at all -- but the actual node operation (`.run()`) should not lean on graph construction this way.
Function nodes hold and process data, the node instance itself should not be used as data; similarly macros are just a graph of other nodes and so ultimately behave the same way. I'm sure you can strong-arm a set of nodes to go in this direction, but I strongly recommend against it. Having function node innards depend on parent graph state is highly non-functional and I think it will only bring headaches in the long run.
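The distinction being drawn here can be sketched in a few lines of generic Python: graph *inspection* (which node labels feed my inputs?) lives in a standalone helper that is fine to call before anything has run, while `run()` stays pure and touches only input values. The `Channel`/`FunctionNode` classes below are simplified stand-ins, not the real `pyiron_workflow` API:

```python
class Channel:
    """Toy input channel: holds a value and records upstream connections."""

    def __init__(self, owner, value=None):
        self.owner = owner        # the node this channel belongs to
        self.value = value
        self.connections = []     # upstream channels feeding this one


class FunctionNode:
    """Toy function node: label, wrapped function, named input channels."""

    def __init__(self, label, func, **inputs):
        self.label = label
        self._func = func
        self.inputs = {k: Channel(self, v) for k, v in inputs.items()}

    def run(self):
        # Pure: uses only the channel *values*, never graph topology.
        return self._func(**{k: c.value for k, c in self.inputs.items()})


def upstream_labels(node) -> dict:
    """Graph-aware helper: for each input, the labels of connected nodes.
    Safe to call before the workflow has ever run."""
    return {
        name: [c.owner.label for c in ch.connections]
        for name, ch in node.inputs.items()
    }
```

So `upstream_labels(node)` answers the "who is connected to me?" question as metadata, without the node's own computation ever branching on it.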
@liamhuber Oh, I definitely do not want to run nodes from within other nodes. Nor do I want nodes to have access to generated input values from other nodes without being connected (that's what connections are there for lol). I would only like information about other nodes in the graph, and the user-defined inputs (like the number of repetitions of a unit cell that a user defined somewhere else in the workflow).
The thing is, openBIS can store datasets with a graph representation where objects are linked together.
I would like workflows to be represented on openBIS as close to 1:1 as possible. And following up on @JNmpi of having openBIS directly in the nodes, I would need some way for each node to know where it sits in the graph.
Do you mean OpenBis directly *in* the node, as in its run functionality exploits graph-location info and talks to OpenBis, or *on* the node, as in there is some custom method like `send_to_OpenBis` or locate-in-OpenBis or whatever? I am still very against the former, but the latter sounds fine and should be quite easy.
Definitely the latter!! I hadn't even considered the former XD
Ok good 😂 Without being very well informed about OpenBis, I can't make an absolute promise, but it certainly seems to me like it should be easy to access all the data it might want by adding a method to nodes. Even better might just be to write a function that takes a node (or nodes) as inputs, so we don't add any OpenBis dependency to the core infrastructure.
No, openBIS should definitely not be a dependency unless we include a bunch of other ones like CKAN, Zenodo etc (which I'm against including in the core infra). There are other candidates for publishing data, and I don't think we should make openBIS "the" knowledge-base for our datasets.