Keep comments and other extra data in graphs

seefeldb commented 1 year ago

It might be useful to save some extra data in serialized graphs, such as the goal of the graph and, where it isn't obvious from the names, the roles specific nodes have, why certain wires exist, or what data they carry.

This is likely to be quite useful when learning from graphs, e.g. to automatically modify or create graphs. It might also already help when searching for a graph in an index, and when rendering a graph.

As an analogy: LLM code creation would almost certainly be worse if the training data had no comments at all (I'm not aware of anyone benchmarking this, though, TBH).

The runtime would ignore these bits, but we could change the breadboard code to make it easy to add comments to the graph itself instead of as `//` or `/* .. */` comments in the code. E.g. `kit.node().role("role of this node").wire(...` or `kit.node().wire("...", otherNode, "reason for this wire")`, or maybe just allow `foo->bar // comment` in the wire spec.
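To make the last option concrete, here is a minimal sketch of how a trailing comment could be split off a wire spec. The `AnnotatedWire` shape and the `parseWireSpec` helper are hypothetical and only handle the simple `foo->bar // comment` form mentioned above; they are not the actual Breadboard wire parser.

```typescript
// Hypothetical shape for a wire that carries an optional comment.
// This is an illustration, not the real Breadboard edge format.
interface AnnotatedWire {
  from: string;
  to: string;
  comment?: string;
}

// Parse a spec such as "foo->bar // carries the user's question".
// Everything after the first "//" is treated as the comment.
function parseWireSpec(spec: string): AnnotatedWire {
  const commentStart = spec.indexOf("//");
  const wirePart = commentStart === -1 ? spec : spec.slice(0, commentStart);
  const comment =
    commentStart === -1 ? "" : spec.slice(commentStart + 2).trim();

  const [from, to] = wirePart.split("->").map((part) => part.trim());
  if (!from || !to) {
    throw new Error(`Invalid wire spec: ${spec}`);
  }
  // Only attach the comment field when there is something to say.
  return comment ? { from, to, comment } : { from, to };
}

const wire = parseWireSpec("foo->bar // carries the user's question");
console.log(JSON.stringify(wire));
```

Because the comment lives in its own field, a runtime could skip it entirely while tools that render or learn from graphs can still read it.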

Then add some graph level data like the purpose of the graph, a few examples of how a user might ask for this task, etc.
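As a rough sketch of what that graph-level data could look like, the snippet below attaches a purpose and a few example requests to a serialized graph and uses them for a naive index search. All names here (`GraphMetadata`, `SerializedGraph`, `withMetadata`, `matchesQuery`) are assumptions made for illustration, not existing Breadboard types.

```typescript
// Hypothetical graph-level metadata, as suggested above.
interface GraphMetadata {
  purpose: string;
  exampleRequests: string[];
}

// Simplified stand-in for a serialized graph; not the real schema.
interface SerializedGraph {
  nodes: { id: string; type: string }[];
  edges: { from: string; to: string }[];
  metadata?: GraphMetadata;
}

// Return a copy of the graph with metadata attached.
function withMetadata(
  graph: SerializedGraph,
  metadata: GraphMetadata
): SerializedGraph {
  return { ...graph, metadata };
}

// Naive case-insensitive substring search over purpose and examples.
function matchesQuery(graph: SerializedGraph, query: string): boolean {
  if (!graph.metadata) return false;
  const needle = query.toLowerCase();
  const haystack = [graph.metadata.purpose]
    .concat(graph.metadata.exampleRequests)
    .join("\n")
    .toLowerCase();
  return haystack.indexOf(needle) !== -1;
}

const annotated = withMetadata(
  { nodes: [{ id: "input-1", type: "input" }], edges: [] },
  {
    purpose: "Summarize a document",
    exampleRequests: ["Give me a summary of this PDF"],
  }
);
console.log(matchesQuery(annotated, "summary"));
```

A real index would likely use embeddings rather than substring matching; the point is only that the metadata travels with the serialized graph.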

@dglazkov WDYT?

dglazkov commented 1 year ago

I like it. Maybe just as special $role and $description properties in the config property bag?

const node = board.node({ $role: 'role of this node' });
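One way the runtime could ignore such properties is to split `$`-prefixed keys from the rest of the config before a node runs. The sketch below assumes a plain object config; `splitAnnotations` is a hypothetical helper, not part of the Breadboard API.

```typescript
type Config = Record<string, unknown>;

// Separate "$"-prefixed annotation keys (e.g. $role, $description)
// from the keys the runtime actually passes to the node.
function splitAnnotations(config: Config): {
  annotations: Config;
  runtimeConfig: Config;
} {
  const annotations: Config = {};
  const runtimeConfig: Config = {};
  for (const key of Object.keys(config)) {
    if (key.charAt(0) === "$") {
      annotations[key] = config[key];
    } else {
      runtimeConfig[key] = config[key];
    }
  }
  return { annotations, runtimeConfig };
}

const { annotations, runtimeConfig } = splitAnnotations({
  $role: "role of this node",
  $description: "longer explanation of what the node does",
  template: "Question: {{question}}",
});
console.log(Object.keys(annotations), Object.keys(runtimeConfig));
```

The serialized graph would keep the full config, so the annotations remain available to renderers and other tools.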

dglazkov commented 1 year ago

The wire comments idea is really interesting. I really like that.

seefeldb commented 1 year ago

Another thing to consider might be marking groups of nodes and wires that together do a particular job, e.g. "create a semantic index for the document". When rendering in Mermaid we could group those nodes!
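A minimal sketch of that rendering idea, using Mermaid's `subgraph` blocks for the groups. The `Group` and `Edge` shapes and the `toMermaid` function are made up for this example and do not reflect how Breadboard generates Mermaid today; labels containing double quotes would need escaping.

```typescript
// Hypothetical group annotation: a label plus the node IDs it covers.
interface Group {
  label: string;
  nodeIds: string[];
}

interface Edge {
  from: string;
  to: string;
}

// Emit a Mermaid flowchart in which each group becomes a subgraph.
function toMermaid(nodeIds: string[], edges: Edge[], groups: Group[]): string {
  const lines: string[] = ["graph TD"];
  const grouped: Record<string, boolean> = Object.create(null);

  groups.forEach((group, index) => {
    lines.push(`  subgraph group${index}["${group.label}"]`);
    for (const id of group.nodeIds) {
      lines.push(`    ${id}`);
      grouped[id] = true;
    }
    lines.push("  end");
  });

  // Nodes that belong to no group are declared at the top level.
  for (const id of nodeIds) {
    if (!grouped[id]) lines.push(`  ${id}`);
  }
  for (const edge of edges) {
    lines.push(`  ${edge.from} --> ${edge.to}`);
  }
  return lines.join("\n");
}

console.log(
  toMermaid(
    ["load", "embed", "store", "answer"],
    [
      { from: "load", to: "embed" },
      { from: "embed", to: "store" },
      { from: "store", to: "answer" },
    ],
    [
      {
        label: "create a semantic index for the document",
        nodeIds: ["embed", "store"],
      },
    ]
  )
);
```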

seefeldb commented 1 year ago

Oh! This might guide default IDs for the nodes, which currently are a bit of a wasted signal for the LLM.
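For illustration, default IDs could be derived from roles by slugifying them and adding a numeric suffix for repeats. This is only a sketch under that assumption; `slugify` and `assignIds` are hypothetical helpers, and the fallback name `node` is arbitrary.

```typescript
// Turn a free-text role into a lowercase, dash-separated slug.
function slugify(role: string): string {
  return role
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Derive one ID per role; repeated roles get "-2", "-3", ... suffixes.
// Note: a role that already ends in such a suffix could still collide.
function assignIds(roles: string[]): string[] {
  const seen: Record<string, number> = Object.create(null);
  return roles.map((role) => {
    const base = slugify(role) || "node";
    const count = (seen[base] || 0) + 1;
    seen[base] = count;
    return count === 1 ? base : `${base}-${count}`;
  });
}

console.log(
  assignIds(["Fetch the document", "Summarize text", "Summarize text"])
);
// → [ 'fetch-the-document', 'summarize-text', 'summarize-text-2' ]
```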

Or we could even imagine a quick pass through an LLM to generate nicer node names based on the roles? Though that would make most sense if a human verifies those.

More food for thought...

seefeldb commented 1 year ago

See https://github.com/google/breadboard-ai/issues/26 for another reason why adding a role would be useful.

Since you proposed adding it to the property bag by default: Should we just do that?

Keep comments and other extra data in graphs #56