Open dglazkov opened 1 year ago
@seefeldb ^^
by "multiple" you mean can be called / yield outputs multiple times at different times, right? (as opposed to having multiple kinds of wires going in and out, which is already supported, although through a single input and output node)
either way, i think nodes could have that as well -- e.g. a UI node that can represent the latest value and/or can repeatedly output the latest value the user set. so yes, they should be equivalent :)
just to bookmark: there's value in capturing the semantics of multiple input/output values in the graph, e.g. whether a value is "consumed" downstream: a text input widget that is wired into a graph that consumes the input should clear the input after submitting, like e.g. in a chat-like interface. but a text input wired into something that just reflects the last value would not, e.g. search results tied to a search box. the system can look at such a graph and notice that the wrong kinds are wired together.
What if every node was actually a graph. Borrowing from the cloud function exploration and how the API currently functions, what if every node followed this protocol:
```mermaid
sequenceDiagram
participant C as Caller
participant N as Node
Note right of C: N = 1
C ->> N: {}
Note over C,N: Initial (empty) request
activate N
N ->> C: { type: "input", schema: { ... }, state: $stateN }
loop Caller can stop at any time
deactivate N
Note over C,N: Get node's intermediate state and input schema
C ->> N: { inputs: { ... }, state: $stateN }
Note over C,N: Call with inputs and the intermediate state
activate N
N ->> C: { type: "output", schema: { ... }, state: $stateN+1 }
deactivate N
Note over C,N: Get outputs, their schema, and next intermediate state
C ->> N: { state: $stateN+1 }
Note over C,N: If $stateN+1 is not empty, call again
end
```
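A caller-side loop for this protocol might look roughly like the sketch below. The type names, the `drive` helper, the `provide` callback, and the toy `echo` node are all hypothetical, and the `outputs` field is borrowed from the `promptTemplate` diagram further down rather than from this one. Nodes are synchronous here only to keep the sketch short.

```typescript
// Hypothetical message shapes; field names loosely follow the diagram.
type Values = Record<string, unknown>;
type Schema = Record<string, unknown>;

interface NodeRequest {
  inputs?: Values;
  state?: unknown;
}

type NodeResponse =
  | { type: "input"; schema: Schema; state?: unknown }
  | { type: "output"; schema: Schema; outputs?: Values; state?: unknown };

type NodeHandler = (request: NodeRequest) => NodeResponse;

// Drives a node until it stops returning state (or maxTurns is reached).
// `provide` is a caller-supplied callback that turns a schema into inputs.
function drive(
  node: NodeHandler,
  provide: (schema: Schema) => Values,
  maxTurns = 10
): Values[] {
  const collected: Values[] = [];
  let response = node({}); // initial (empty) request
  for (let turn = 0; turn < maxTurns; turn++) {
    if (response.type === "output") {
      collected.push(response.outputs ?? {});
      if (response.state === undefined) break; // empty state: node is done
      response = node({ state: response.state }); // call again with next state
    } else {
      response = node({ inputs: provide(response.schema), state: response.state });
    }
  }
  return collected;
}

// Toy node for illustration: asks for `text`, echoes it, then finishes.
const echo: NodeHandler = ({ inputs, state }) => {
  const text = inputs?.text;
  if (text === undefined) {
    return { type: "input", schema: { text: "string" }, state: state ?? "waiting" };
  }
  return { type: "output", schema: { text: "string" }, outputs: { text } };
};
```

Since the caller owns the loop, it can stop after any turn simply by not calling again.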
There's a shortcut a caller can take. If they just supply all inputs right away, it can be a one-turn call:
```mermaid
sequenceDiagram
participant C as Caller
participant N as Node
C ->> N: { inputs: { ... } }
Note over C,N: In a simpler flow, just supply inputs
activate N
N ->> C: { type: "output", schema: { ... } }
deactivate N
Note over C,N: And get outputs. No "state" returned
```
Of course, if this node actually contains interesting multi-turn interactions, this `runOnce` shortcut will not enjoy any of them.
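To make the trade-off concrete, here is a minimal sketch of a one-turn helper, assuming the same request/response shapes as in the diagrams. The `runOnce` function signature and the `counter` node are hypothetical illustrations, not the existing API.

```typescript
type Values = Record<string, unknown>;
type NodeResponse =
  | { type: "input"; schema: Values; state?: unknown }
  | { type: "output"; schema: Values; outputs: Values; state?: unknown };
type NodeHandler = (request: { inputs?: Values; state?: unknown }) => NodeResponse;

// One-turn call: supply every input up front and drop any returned state.
function runOnce(node: NodeHandler, inputs: Values): Values {
  const response = node({ inputs });
  if (response.type !== "output") {
    throw new Error("node asked for more input; use the multi-turn flow");
  }
  return response.outputs; // any `state` is ignored, so later turns are lost
}

// Toy multi-turn node: counts how many turns it has seen via `state`.
const counter: NodeHandler = ({ state }) => {
  const count = typeof state === "number" ? state + 1 : 1;
  return {
    type: "output",
    schema: { count: "number" },
    outputs: { count },
    state: count,
  };
};
```

Calling `runOnce(counter, {})` only ever observes the first turn, which is exactly the limitation described above.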
The annotated schema: Is that how a REST client would use it? I'd expect them to most of the time assume the schemas. It would then look like a hybrid of the two: Directly supplying inputs, but possibly looping
And +1 to the general idea. Except for the dynamic schema declaration, this is already the case, no? Just with the state variable being an implicit protocol, like in `append`.
I just realized this morning that dynamic schema declaration can be viewed as a multi-turn flow. For example, here's what `promptTemplate` might be doing:
```mermaid
sequenceDiagram
participant C as Caller
participant N as promptTemplate
C ->> N: { inputs: {} }
Note over C,N: Caller initiates
activate N
N ->> C: { type: "input", schema: { "template" }, state: { "getTemplate" } }
deactivate N
Note over C,N: promptTemplate asks for "template"
C ->> N: { inputs: { template }, state: { "getTemplate" } }
Note over C,N: Caller supplies the template
activate N
N ->> C: { type: "input", schema: { ... }, state: { template } }
deactivate N
Note over C,N: promptTemplate asks for placeholders (template itself is state)
C ->> N: { inputs: { ... }, state: { template } }
Note over C,N: Call with placeholder input values
activate N
N ->> C: { type: "output", schema: { prompt }, outputs: prompt }
deactivate N
Note over C,N: promptTemplate returns assembled prompt
```
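The three turns could be sketched as a single stateless handler, as below. This is an illustration under assumptions, not the actual `promptTemplate` implementation: the `{{name}}` placeholder syntax, the `step` key in the first state, and the use of a plain string array as the schema are all made up for the example.

```typescript
type Values = Record<string, unknown>;
type TemplateRequest = { inputs?: Values; state?: Values };
type TemplateResponse =
  | { type: "input"; schema: string[]; state: Values }
  | { type: "output"; schema: string[]; outputs: Values };

// Collects unique placeholder names, assuming a {{name}} syntax.
function placeholders(template: string): string[] {
  const names: string[] = [];
  const pattern = /\{\{(\w+)\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(template)) !== null) {
    const name = match[1];
    if (name !== undefined && names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

function promptTemplate({ inputs = {}, state = {} }: TemplateRequest): TemplateResponse {
  const template = state.template ?? inputs.template;
  if (typeof template !== "string") {
    // Turn 1: nothing is known yet, so ask for the template itself.
    return { type: "input", schema: ["template"], state: { step: "getTemplate" } };
  }
  const missing = placeholders(template).filter((name) => inputs[name] === undefined);
  if (missing.length > 0) {
    // Turn 2: the template becomes state; ask for its placeholders.
    return { type: "input", schema: missing, state: { template } };
  }
  // Turn 3: every placeholder is supplied, so assemble the prompt.
  const prompt = template.replace(/\{\{(\w+)\}\}/g, (_, name: string) =>
    String(inputs[name])
  );
  return { type: "output", schema: ["prompt"], outputs: { prompt } };
}
```

In this sketch, passing `template` and all placeholder values in a single request skips straight to the output, which matches the one-turn shortcut described earlier.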
An interesting question is then: how might a node/graph be interrogated to produce a reasonable representation of its inputs and outputs?
A use case here would be a visual graph editor: how would it know what inputs/outputs to display?
In the case of `promptTemplate`, the final number of inputs is static, but depends on the first input (`template`). Without knowing how this node works on the inside, it's very hard to figure that out.
Is there a way to describe this without adding an extra "designView" representation?
> It would then look like a hybrid of the two: Directly supplying inputs, but possibly looping
Yes! Which means that maybe we need to differentiate between `runOnce` and `run` modes explicitly (so that the node/graph doesn't assume it can just keep grabbing inputs from the initial property bag)
💡 Sketch of a proposal:
`Node`, in addition to `wire`, has three extra methods: `run`, `runOnce`, and `describe`.
- `run` and `runOnce` are roughly equivalent to their `Board` cousins
- `describe` method returns schemas of inputs and outputs of a `Node`, including some way to express dynamic declarations
- `Board` also has a `describe` method that roughly does the same.

What's the difference between `run` and `runOnce` at the node implementation level? A node that expects to be called again would always return state (unless it is done), right? So maybe there is no difference there? (I now understand that you mean `run` and `runOnce` as methods on `Node`, shared among all nodes, right?)
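One way to read the proposal and the question that follows it: each node implements `run` and `describe`, while `runOnce` is shared and defined in terms of `run`. The sketch below assumes that reading. `NodeLike`, `BaseNode`, `RunResult`, and the `Accumulate` toy node are hypothetical names, and `wire` is left out.

```typescript
type Values = Record<string, unknown>;
type Schema = Record<string, string>;

interface NodeDescription {
  inputs: Schema;
  outputs: Schema;
}

interface RunResult {
  outputs: Values;
  state?: unknown; // present only when the node expects to be called again
}

interface NodeLike {
  describe(): NodeDescription;
  run(inputs: Values, state?: unknown): RunResult;
  runOnce(inputs: Values): Values;
}

abstract class BaseNode implements NodeLike {
  abstract describe(): NodeDescription;
  abstract run(inputs: Values, state?: unknown): RunResult;

  // Shared by all nodes: a single `run` turn with the state dropped.
  runOnce(inputs: Values): Values {
    return this.run(inputs).outputs;
  }
}

// Toy node: keeps a running total in `state`.
class Accumulate extends BaseNode {
  describe(): NodeDescription {
    return { inputs: { value: "number" }, outputs: { total: "number" } };
  }

  run(inputs: Values, state?: unknown): RunResult {
    const previous = typeof state === "number" ? state : 0;
    const total = previous + Number(inputs.value ?? 0);
    return { outputs: { total }, state: total };
  }
}
```

Under this reading, a node author never writes `runOnce` at all, which would answer the "is there a difference at the implementation level" question with "no".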
How would `describe` be implemented? Most nodes would export a static schema (can we generate that from TypeScript?), but some, like `generateText`, would require the flow above that requires multiple calls. But when calling `describe` you don't want to accidentally run the whole node, right? Maybe there is a default `describe` implementation that can be overridden?
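A default-plus-override arrangement could look something like this sketch. The class names, the `known` parameter (inputs the caller already has, such as a template typed into an editor), and the `{{name}}` placeholder syntax are all assumptions made for illustration. Neither `describe` runs the node.

```typescript
type Schema = Record<string, string>;
interface Description {
  inputs: Schema;
  outputs: Schema;
}

class DescribableNode {
  protected description: Description;

  constructor(description: Description) {
    this.description = description;
  }

  // Default: return the static schema, ignoring any known inputs.
  describe(_known: Record<string, unknown> = {}): Description {
    return this.description;
  }
}

class TemplateNode extends DescribableNode {
  constructor() {
    super({ inputs: { template: "string" }, outputs: { prompt: "string" } });
  }

  // Override: derive extra inputs from the template, if one is known.
  describe(known: Record<string, unknown> = {}): Description {
    const inputs: Schema = { ...this.description.inputs };
    const template = known.template;
    if (typeof template === "string") {
      const pattern = /\{\{(\w+)\}\}/g;
      let match: RegExpExecArray | null;
      while ((match = pattern.exec(template)) !== null) {
        const name = match[1];
        if (name !== undefined) inputs[name] = "string";
      }
    }
    return { inputs, outputs: this.description.outputs };
  }
}
```

A visual editor could call `describe()` first to learn about `template`, then call it again with the template value to discover the placeholder inputs.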
> I just realized this morning that dynamic schema declaration can be viewed as a multi-turn flow. For example, here's what `promptTemplate` might be doing:
+1, but per the previous comment we might want to differentiate a calling modality that just queries the node for its description.
We should also specify how state behaves here: I think it should not advance while looping over missing inputs. That is, we want to differentiate between two rounds in a conversation and asking for a missing value. If both look like requesting input these can't be differentiated. In one the state advances and in the other it doesn't. Hence:
- In `describe` mode, the state is just context and won't be returned.

Does that sound right?
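The distinction between "asking for a missing value" and "a new round" can be sketched as follows. The `chatTurn` function, its response shape, and the use of a round counter as state are hypothetical, chosen only to show the state standing still in one case and advancing in the other.

```typescript
type Values = Record<string, unknown>;
type TurnResponse =
  | { type: "input"; missing: string[]; state: number }
  | { type: "output"; outputs: Values; state: number };

// Toy conversational node: `state` counts completed rounds.
function chatTurn(inputs: Values, state = 0): TurnResponse {
  const text = inputs.text;
  if (typeof text !== "string") {
    // Missing value: ask again, but hand back the same state.
    return { type: "input", missing: ["text"], state };
  }
  // A real round happened, so the state advances.
  return { type: "output", outputs: { reply: `echo: ${text}` }, state: state + 1 };
}
```

With this shape, a caller can tell the two cases apart by comparing the returned state with the one it sent.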
FWIW, state isn't anything special here. We might want to standardize on `$state` or `$self` for legibility, but there might be other inputs that are expected on every call in a multi-turn call, e.g. the safety settings.
One more potential difference between the regular flow and `describe`: `describe` will also want to declare outputs, but when e.g. used in an editor might also be used to figure out the inputs as a function of outputs. `passthrough` would be an example of such a node. Maybe that's too esoteric for the first round, but something to keep in mind.
Indeed, this might map to a different kind of pattern: Generating an input based on other inputs and desired outputs. For example:
- […] `template` is missing, but the other inputs are present)
- […] `code` input).

This relates to google/breadboard-ai#56 (keeping comments and extra data in the graph for graph generation purposes): Imagine a graph generation flow that creates a graph with inputs and outputs and task descriptions, but omits `template` and `code` for a second pass that then generates those.
To be clear, I don't think this automation should automatically happen due to a `describe` call. This is more an adjacent concept that might be useful in an editor (where we might want to pro-actively generate example template / code / etc based on inputs).
That said, a neat way to deal with this is to wrap the template/code generation in a graph and thus make this a pattern that can be included (so `promptTemplate` and `runJavascript` actually become `include`s that generate the body, and maybe also do caching and GAR from a library and all that stuff).
We're almost there!
In terms of behavior, graphs are (or can be, with `runOnce` semantics) equivalent to nodes. What if they were?
OTOH, the graphs are a superset of nodes in terms of behavior -- for example, they may not have any--or multiple!--inputs and have none or multiple outputs.