breadboard-ai / breadboard

A library for prototyping generative AI applications.
Apache License 2.0

Explore node and graph equivalence #26

Open dglazkov opened 1 year ago

dglazkov commented 1 year ago

In terms of behavior, graphs are (or can be, with runOnce semantics) equivalent to nodes. What if they were?

OTOH, graphs are a superset of nodes in terms of behavior -- for example, a graph may have no inputs (or multiple!) and likewise no outputs or multiple outputs.

dglazkov commented 1 year ago

@seefeldb ^^

seefeldb commented 1 year ago

by "multiple" you mean can be called / yield outputs multiple times at different times, right? (as opposed to having multiple kinds of wires going in and out, which is already supported, although through a single input and output node)

either way, i think nodes could have that as well -- e.g. a UI node that can represent the latest value and/or can repeatedly output the latest value the user set. so yes, they should be equivalent :)

seefeldb commented 1 year ago

just to bookmark: there's value in capturing the semantics of multiple input/output values in the graph, e.g. whether a value is "consumed" downstream: a text input widget that is wired into a graph that consumes the input should clear the input after submitting, like e.g. in a chat-like interface. but a text input wired into something that just reflects the last value would not, e.g. search results tied to a search box. the system can look at such a graph and notice that the wrong kinds are wired together.

dglazkov commented 1 year ago

What if every node was actually a graph. Borrowing from the cloud function exploration and how the API currently functions, what if every node followed this protocol:

sequenceDiagram
  participant C as Caller
  participant N as Node
  Note right of C: N = 1
  C ->> N: {}
  Note over C,N: Initial (empty) request
  activate N
  N ->> C: { type: "input", schema: { ... }, state: $stateN }
  loop Caller can stop at any time
    deactivate N
    Note over C,N: Get node's intermediate state and input schema
    C ->> N: { inputs: { ... }, state: $stateN }
    Note over C,N: Call with inputs and the intermediate state  
    activate N
    N ->> C: { type: "output", schema: { ... }, state: $stateN+1 }
    deactivate N
    Note over C,N: Get outputs, their schema, and next intermediate state
    C ->> N: { state: $stateN+1 }
    Note over C,N: If $stateN+1 is not empty, call again
  end
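
A minimal TypeScript sketch of the exchange in the diagram above. All names here (`NodeRequest`, `NodeResponse`, `ProtocolNode`, `shout`, `drive`) are hypothetical and not part of the Breadboard API. The sketch also assumes that an output response carries an `outputs` field and that an absent `state` means the node is done.

```typescript
// Hypothetical message shapes modeled on the diagram; not Breadboard types.
type Schema = Record<string, string>;
type Values = Record<string, unknown>;

interface NodeRequest {
  inputs?: Values;
  state?: string;
}

type NodeResponse =
  | { type: "input"; schema: Schema; state?: string }
  | { type: "output"; schema: Schema; outputs: Values; state?: string };

type ProtocolNode = (request: NodeRequest) => NodeResponse;

// A toy node: each turn it asks for "text" and returns it uppercased.
// It runs for two turns and keeps the turn number in the opaque state string.
const shout: ProtocolNode = ({ inputs, state }) => {
  const turn = state === undefined ? 0 : Number(state);
  const text = inputs?.text;
  if (typeof text !== "string") {
    // No usable input yet: describe what is needed, keep the same state.
    return { type: "input", schema: { text: "string" }, state: String(turn) };
  }
  const next = turn + 1;
  return {
    type: "output",
    schema: { shouted: "string" },
    outputs: { shouted: text.toUpperCase() },
    // Assumption: an absent state tells the caller there is nothing left to do.
    state: next < 2 ? String(next) : undefined,
  };
};

// Drives a node until it stops returning state. `provide` supplies values
// for whatever schema the node asks for. The caller could stop at any time.
function drive(
  node: ProtocolNode,
  provide: (schema: Schema, turn: number) => Values
): Values[] {
  const collected: Values[] = [];
  let turn = 0;
  let response = node({}); // initial (empty) request
  for (;;) {
    if (response.type === "input") {
      response = node({ inputs: provide(response.schema, turn), state: response.state });
    } else {
      collected.push(response.outputs);
      if (response.state === undefined) break;
      turn += 1;
      response = node({ state: response.state });
    }
  }
  return collected;
}
```

Calling `drive(shout, (_schema, turn) => ({ text: "turn " + turn }))` walks both turns and collects one output object per turn.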

dglazkov commented 1 year ago

There's a shortcut a caller can take. If they just supply all inputs right away, it can be a one-turn call:

sequenceDiagram
  participant C as Caller
  participant N as Node
  C ->> N: { inputs: { ... } }
  Note over C,N: In a simpler flow, just supply inputs
  activate N
  N ->> C: { type: "output", schema: { ... } }
  deactivate N
  Note over C,N: And get outputs. No "state" returned

dglazkov commented 1 year ago

> There's a shortcut a caller can take. If they just supply all inputs right away, it can be a one-turn call:

Of course, if this node actually contains interesting multi-turn interactions, this runOnce shortcut will not enjoy any of them.
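
A rough sketch of what that shortcut could look like from the caller's side, in TypeScript. `runOnce` here is a hypothetical free function and `counter` is a toy node; neither is taken from the Breadboard codebase. The point is only that any returned state is dropped, so later turns never happen.

```typescript
// Hypothetical message shapes; not Breadboard types.
type Values = Record<string, unknown>;

interface NodeRequest {
  inputs?: Values;
  state?: string;
}

interface NodeResponse {
  type: "input" | "output";
  outputs?: Values;
  state?: string;
}

type ProtocolNode = (request: NodeRequest) => NodeResponse;

// One call, all inputs up front. Any state the node returns is ignored.
function runOnce(node: ProtocolNode, inputs: Values): Values {
  const response = node({ inputs });
  if (response.type !== "output" || response.outputs === undefined) {
    throw new Error("node asked for more input; it cannot finish in one turn");
  }
  return response.outputs; // response.state is dropped on purpose
}

// A toy multi-turn node: it keeps counting for as long as the caller
// passes the returned state back in.
const counter: ProtocolNode = ({ inputs, state }) => {
  const start = inputs?.start;
  const previous = state === undefined ? undefined : Number(state);
  const count =
    previous !== undefined ? previous + 1 : typeof start === "number" ? start : 0;
  return { type: "output", outputs: { count }, state: String(count) };
};
```

With these definitions, `runOnce(counter, { start: 5 })` yields only the first count, while a caller that passes the state back would keep receiving new values.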

seefeldb commented 1 year ago

The annotated schema: Is that how a REST client would use it? I'd expect them to assume the schemas most of the time. It would then look like a hybrid of the two: Directly supplying inputs, but possibly looping.

seefeldb commented 1 year ago

And +1 to the general idea. Except for the dynamic schema declaration, this is already the case, no? Just with the state variable being an implicit protocol like in append.

dglazkov commented 1 year ago

I just realized this morning that dynamic schema declaration can be viewed as a multi-turn flow. For example, here's what promptTemplate might be doing:

sequenceDiagram
    participant C as Caller
    participant N as promptTemplate
    C ->> N: { inputs: {} }
    Note over C,N: Caller initiates
    activate N
    N ->> C: { type: "input", schema: { "template" }, state: { "getTemplate" } }
    deactivate N
    Note over C,N: promptTemplate asks for "template"
    C ->> N: { inputs: { template }, state: { "getTemplate" } }
    Note over C,N: Caller supplies the template
    activate N
    N ->> C: { type: "input", schema: { ... }, state: { template } }
    deactivate N
    Note over C,N: promptTemplate asks for placeholders (template itself is state)
    C ->> N: { inputs: { ... }, state: { template } }
    Note over C,N: Call with placeholder input values   
    activate N
    N ->> C: { type: "output", schema: { prompt }, outputs: prompt }
    deactivate N
    Note over C,N: promptTemplate returns assembled prompt
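
The three turns in the diagram above could be sketched in TypeScript roughly as follows. This is an illustration, not the actual promptTemplate implementation: the `{{name}}` placeholder syntax, the type names, and the use of a plain list of names as the schema are all assumptions.

```typescript
// Hypothetical shapes; the real promptTemplate node may differ.
type Values = { [name: string]: string | undefined };
type TemplateState = { step: "getTemplate" } | { template: string };

interface TemplateRequest {
  inputs?: Values;
  state?: TemplateState;
}

type TemplateResponse =
  | { type: "input"; schema: string[]; state: TemplateState }
  | { type: "output"; schema: string[]; outputs: { prompt: string } };

// Collects unique {{name}} placeholders (assumed syntax) in order of appearance.
function placeholders(template: string): string[] {
  const names: string[] = [];
  const pattern = /\{\{(\w+)\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(template)) !== null) {
    if (names.indexOf(match[1]) === -1) names.push(match[1]);
  }
  return names;
}

function promptTemplate({ inputs = {}, state }: TemplateRequest): TemplateResponse {
  // Turn 1: nothing is known yet, so ask for the template.
  if (state === undefined) {
    return { type: "input", schema: ["template"], state: { step: "getTemplate" } };
  }
  // Turn 2: the template arrives. It becomes the state, and its
  // placeholders become the next input schema.
  if ("step" in state) {
    const template = inputs.template ?? "";
    return { type: "input", schema: placeholders(template), state: { template } };
  }
  // Turn 3: placeholder values arrive. Assemble the prompt.
  const prompt = state.template.replace(
    /\{\{(\w+)\}\}/g,
    (_whole: string, name: string) => inputs[name] ?? ""
  );
  return { type: "output", schema: ["prompt"], outputs: { prompt } };
}
```

The input schema returned on the second turn depends entirely on the template supplied in that turn, which is what makes the schema dynamic.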

dglazkov commented 1 year ago

> For example, here's what promptTemplate might be doing

An interesting question is then: how might a node/graph be interrogated to produce a reasonable representation of its inputs and outputs?

A use case here would be a visual graph editor: how would it know what inputs/outputs to display?

In the case of promptTemplate, the final number of inputs is static, but depends on the first input (template). Without knowing how this node works on the inside, it's very hard to figure that out.

Is there a way to describe this without adding an extra "designView" representation?

dglazkov commented 1 year ago

> It would then look like a hybrid of the two: Directly supplying inputs, but possibly looping

Yes! Which means that maybe we need to differentiate between a runOnce and run modes explicitly (so that the node/graph doesn't assume it can just keep grabbing inputs from the initial property bag)

dglazkov commented 1 year ago

💡 Sketch of a proposal:

seefeldb commented 1 year ago

What's the difference between run and runOnce at the node implementation level? A node that expects to be called again would always return state (unless it is done), right? So maybe there is no difference there? (I now understand that you mean run and runOnce as methods on Node, shared among all nodes, right?)

How would describe be implemented? Most nodes would export a static schema (can we generate that from TypeScript?), but some, like generateText, would require the flow above that requires multiple calls. But when calling describe you don't want to accidentally run the whole node, right? Maybe there is a default describe implementation that can be overridden?
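
One way the "default describe that can be overridden" idea might look, as a TypeScript sketch. The class names, the `Schema` shape, and the `{{name}}` placeholder syntax are assumptions for illustration. The key property is that `describe` never calls `run`.

```typescript
// Hypothetical shapes; not the Breadboard API.
type Schema = { inputs: string[]; outputs: string[] };
type Values = { [name: string]: string | undefined };

abstract class DescribableNode {
  // Most nodes would declare a static schema once.
  protected abstract readonly schema: Schema;

  abstract run(inputs: Values): Values;

  // Default: return the static schema, ignoring any inputs already known.
  describe(_known: Values = {}): Schema {
    return this.schema;
  }
}

// A node with a fixed shape relies on the default describe.
class UppercaseNode extends DescribableNode {
  protected readonly schema: Schema = { inputs: ["text"], outputs: ["text"] };

  run(inputs: Values): Values {
    return { text: (inputs.text ?? "").toUpperCase() };
  }
}

// A node whose inputs depend on another input overrides describe.
class TemplateNode extends DescribableNode {
  protected readonly schema: Schema = { inputs: ["template"], outputs: ["prompt"] };

  describe(known: Values = {}): Schema {
    const template = known.template;
    if (template === undefined) return this.schema;
    // Derive extra inputs from the template without running the node.
    const found: string[] = template.match(/\{\{\w+\}\}/g) ?? [];
    const names = found.map((placeholder) => placeholder.slice(2, -2));
    return { inputs: ["template", ...names], outputs: ["prompt"] };
  }

  run(inputs: Values): Values {
    const prompt = (inputs.template ?? "").replace(
      /\{\{(\w+)\}\}/g,
      (_whole: string, name: string) => inputs[name] ?? ""
    );
    return { prompt };
  }
}
```

An editor could call `describe()` with no arguments to get the static shape, then call it again with the template once the user has typed one.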

seefeldb commented 1 year ago

> I just realized this morning that dynamic schema declaration can be viewed as a multi-turn flow. For example, here's what promptTemplate might be doing:

+1, but per the previous comment we might want to differentiate a calling modality that just queries the node for its description.

We should also specify how state behaves here: I think it should not advance while looping over missing inputs. That is, we want to differentiate between two rounds in a conversation and asking for a missing value. If both look like requesting input these can't be differentiated. In one the state advances and in the other it doesn't. Hence:

Does that sound right?

FWIW, state isn't anything special here. We might want to standardize on $state or $self for legibility, but there might be other inputs that are expected on every call in a multi-turn call, e.g. the safety settings.
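
One possible reading of the rule that state should not advance while a node is only asking for missing inputs, sketched in TypeScript. The `chatTurn` function, the `missing` field, and the list of required inputs are hypothetical; `$state` and the safety settings input are borrowed from the comment above purely as examples.

```typescript
// Hypothetical shapes; not the Breadboard API.
type Values = { [name: string]: string | undefined };

interface TurnRequest {
  inputs?: Values;
  $state?: number;
}

type TurnResponse =
  | { type: "input"; missing: string[]; $state: number }
  | { type: "output"; outputs: Values; $state: number };

// Inputs expected on every call of a multi-turn conversation (assumed).
const REQUIRED = ["question", "safetySettings"];

function chatTurn({ inputs = {}, $state: state = 0 }: TurnRequest): TurnResponse {
  const missing = REQUIRED.filter((name) => inputs[name] === undefined);
  if (missing.length > 0) {
    // Asking for a missing value: the same state comes back unchanged.
    return { type: "input", missing, $state: state };
  }
  // A completed round of the conversation: the state advances.
  return {
    type: "output",
    outputs: { reply: "turn " + state + ": " + inputs.question },
    $state: state + 1,
  };
}
```

Under this reading, a caller can tell the two cases apart by comparing the returned `$state` with the one it sent.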

seefeldb commented 1 year ago

One more potential difference between the regular flow and describe: describe will also want to declare outputs, but when e.g. used in an editor might also be used to figure out the inputs as a function of outputs. passthrough would be an example of such a node. Maybe that's too esoteric for the first round, but something to keep in mind.

Indeed, this might map to a different kind of pattern: Generating an input based on other inputs and desired outputs. For example:

This relates to google/breadboard-ai#56 (keeping comments and extra data in the graph for graph generation purposes): Imagine a graph generation flow that creates a graph with inputs and outputs and task descriptions, but omits template and code for a second pass that then generates those.

To be clear, I don't think this generation should happen automatically as a result of a describe call. This is more an adjacent concept that might be useful in an editor (where we might want to proactively generate example template / code / etc based on inputs).

That said, a neat way to deal with this is to wrap the template/code generation in a graph and thus make this a pattern that can be included (so promptTemplate and runJavascript actually become includes that generate the body, and maybe also do caching and GAR from a library and all that stuff).

dglazkov commented 7 months ago

We're almost there!