ajbouh / substrate


bridge: documents prototype #22

Closed progrium closed 5 months ago

progrium commented 7 months ago

Documents are text buffers that users and assistants can edit in realtime. Each one is accessible by URL, so a stage or user can view and edit it in a CodeMirror 6 editor. Yjs is used for realtime syncing in the browser with their reflection server. For now, the assistant view (the page loaded in chromestage) will maintain a WebSocket connection that pushes the latest document state to the bridge, which holds it in memory server-side for reading. Server-side mutation operations will be pushed to the browser, applied to CodeMirror, and synced with other users.

Mutations map to CodeMirror change transactions, so the API is somewhat low level ({from: number, to?: number, insert?: string}). Unless the LLM is good enough to take a high-level operation plus the document state and produce this transaction (maybe we can), we might only expect to be able to prepend/append data to the doc, but even that would be convincing.
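To make the shape of that low-level API concrete, here is a minimal Go sketch of applying one such change to a plain string on the server side. The `Change` type and the `Apply`, `Prepend`, and `Append` helpers are hypothetical names, not part of the bridge or of CodeMirror. The sketch also assumes offsets count runes, whereas CodeMirror itself counts UTF-16 code units, so a real implementation would need to convert between the two.

```go
package main

import "errors"

// Change mirrors the {from, to?, insert?} shape described above.
// To is a pointer so that "absent" can be told apart from 0.
type Change struct {
	From   int
	To     *int
	Insert string
}

// Apply returns doc with the change applied.
// Assumption: offsets index runes, not UTF-16 code units.
func Apply(doc string, c Change) (string, error) {
	r := []rune(doc)
	to := c.From // an absent To means a pure insertion at From
	if c.To != nil {
		to = *c.To
	}
	if c.From < 0 || to < c.From || to > len(r) {
		return "", errors.New("change out of range")
	}
	out := make([]rune, 0, len(r)+len(c.Insert))
	out = append(out, r[:c.From]...)
	out = append(out, []rune(c.Insert)...)
	out = append(out, r[to:]...)
	return string(out), nil
}

// Prepend builds a change that inserts text at the start of any document.
func Prepend(text string) Change {
	return Change{From: 0, Insert: text}
}

// Append builds a change that inserts text at the end of doc.
// It needs the current document only to learn its length.
func Append(doc, text string) Change {
	return Change{From: len([]rune(doc)), Insert: text}
}
```

Prepend needs no knowledge of the document and append only needs its length, which is why those two operations are the easy fallback if the LLM cannot produce arbitrary offsets reliably.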

progrium commented 6 months ago

Even though a version of this could be done before/without stages, I would prioritize #10 over this.

ajbouh commented 6 months ago

I agree

ajbouh commented 6 months ago

#10 is nearing completion, so some notes on this:

The simplest thing we could make would be something like this:

type ReadableTextBuffer interface {
  GetText() (string, error)
}

type WritableTextBuffer interface {
  SetText(t string) error
}

type TextBuffer interface {
  ReadableTextBuffer
  WritableTextBuffer
}

If it's helpful, there's some (~reusable) "Announcer" code in the cueloader package. It implements an http.Handler that announces the current value of a []byte. If the request indicates Accept: text/event-stream it will stream events. Otherwise it returns the latest value.
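For readers who have not seen the cueloader code, the behaviour described above could look roughly like the sketch below. This is not the cueloader implementation. The `Announcer` type, its methods, and `formatEvent` are hypothetical names written from the description alone.

```go
package main

import (
	"io"
	"net/http"
	"strings"
	"sync"
)

// Announcer (hypothetical) holds the latest value and fans it out to subscribers.
type Announcer struct {
	mu    sync.Mutex
	value []byte
	subs  map[chan []byte]struct{}
}

func NewAnnouncer() *Announcer {
	return &Announcer{subs: make(map[chan []byte]struct{})}
}

// Announce stores v and hands it to every subscriber. Each subscriber channel
// has capacity 1; a stale pending value is replaced, so readers see the latest.
func (a *Announcer) Announce(v []byte) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.value = v
	for ch := range a.subs {
		select {
		case <-ch: // discard an unread, older value
		default:
		}
		ch <- v
	}
}

// Subscribe returns the current value, a channel of later values, and a cancel func.
func (a *Announcer) Subscribe() (current []byte, ch chan []byte, cancel func()) {
	a.mu.Lock()
	defer a.mu.Unlock()
	ch = make(chan []byte, 1)
	a.subs[ch] = struct{}{}
	return a.value, ch, func() {
		a.mu.Lock()
		defer a.mu.Unlock()
		delete(a.subs, ch)
	}
}

// formatEvent encodes data as one server-sent event, one "data:" line per text line.
func formatEvent(data []byte) string {
	var b strings.Builder
	for _, line := range strings.Split(string(data), "\n") {
		b.WriteString("data: " + line + "\n")
	}
	b.WriteString("\n")
	return b.String()
}

// ServeHTTP streams events if the client asks for text/event-stream,
// and otherwise returns the latest value once.
func (a *Announcer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	current, ch, cancel := a.Subscribe()
	defer cancel()
	if !strings.Contains(r.Header.Get("Accept"), "text/event-stream") {
		w.Write(current)
		return
	}
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	io.WriteString(w, formatEvent(current))
	flusher.Flush()
	for {
		select {
		case <-r.Context().Done():
			return
		case v := <-ch:
			io.WriteString(w, formatEvent(v))
			flusher.Flush()
		}
	}
}
```

Replacing a stale pending value instead of queueing every update fits the "announce the current value" idea: a slow client skips intermediate states but always ends up with the newest one.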

We could wire it up to file notifications to keep everything simple and compatible with things other services might want to do.

(This would not be a good experience if someone is making simultaneous edits. But it would be enough to get a feel for the interaction before we invest in more complex UX.)

This could be a very simple server that is parameterized by a space and a path. Something like:

https://substrate.home.arpa/textbuffer(d=sp-2303AIAC)/raw/foo/bar.md

This could support GET and PUT. It could also support PATCH (lol!).
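A GET/PUT server over plain files might be sketched as follows. Everything here is an assumption for illustration: the `resolve` and `rawHandler` names, the 1 MiB body limit, and the idea that the handler is mounted so that it only sees the `/raw/...` part of the URL (for example via `http.StripPrefix`), with the space already mapped to a `root` directory.

```go
package main

import (
	"errors"
	"io"
	"net/http"
	"os"
	"path"
	"path/filepath"
	"strings"
)

// resolve maps a URL path such as "/raw/foo/bar.md" to a file under root.
// Cleaning the path as if it were rooted removes any ".." that would escape root.
func resolve(root, urlPath string) (string, error) {
	if !strings.HasPrefix(urlPath, "/raw/") {
		return "", errors.New("not a raw path")
	}
	rel := path.Clean("/" + strings.TrimPrefix(urlPath, "/raw/"))
	if rel == "/" {
		return "", errors.New("missing file path")
	}
	return filepath.Join(root, filepath.FromSlash(rel)), nil
}

// rawHandler (hypothetical) serves GET and PUT for files under root.
func rawHandler(root string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		file, err := resolve(root, r.URL.Path)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		switch r.Method {
		case http.MethodGet:
			data, err := os.ReadFile(file)
			if errors.Is(err, os.ErrNotExist) {
				http.NotFound(w, r)
				return
			}
			if err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			w.Header().Set("Content-Type", "text/plain; charset=utf-8")
			w.Write(data)
		case http.MethodPut:
			// Assumed limit of 1 MiB; longer bodies are silently truncated here.
			body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			if err := os.MkdirAll(filepath.Dir(file), 0o755); err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			if err := os.WriteFile(file, body, 0o644); err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			w.WriteHeader(http.StatusNoContent)
		default:
			w.Header().Set("Allow", "GET, PUT")
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		}
	})
}
```

Because the handler only reads and writes ordinary files, anything else that edits the same files stays compatible with it.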

We could implement a human-accessible edit UI for this as a textarea whose contents are replaced by events arriving on a text/event-stream. There might be a button in the UI that is somehow wired up to SetText().

https://substrate.home.arpa/textbuffer(d=sp-2303AIAC)/edit/foo/bar.md

The UI would be helpful for people to paste/add basic info.

We might expect the assistant backend to make the same HTTP requests as the UI, or to eval javascript on the UI itself using chromestage.

For viewing in chromestage (or the browser) we could serve a simple page that renders markdown at:

https://substrate.home.arpa/textbuffer(d=sp-2303AIAC)/view/foo/bar.md

This would also subscribe to the same event-stream and update the view as edits are made.

Beyond get/set, the next "methods" we might want are:

type Revision struct {
  ID string
  Parents []string // for now always a single item list
  Time time.Time
  Author string
}

type MultiRevisionTextBuffer interface {
  TextBuffer
  RestoreRevision(id string)
  GetRevisionTextBuffer(id string) ReadableTextBuffer
  RevisionLog(initialID string, limit int) []Revision
}

Under the hood this could be a bare git repo. In that case we might implement this as a single branch or ref. We'd need to set up file watchers. We'd need a file watcher for the git repo ref file and then re-announce when that ref changes.

ajbouh commented 6 months ago

Subsequent iterations on this concept could provide collaborative editing of the file using CodeMirror 6 and one of the collaborative editing plugins for it.

If the approach we start with uses file notifications and simple HTTP verbs, then we'd have a nice point of integration for these different services.

ajbouh commented 6 months ago

@progrium ^

ajbouh commented 6 months ago

Another thing to note is that a MultiRevisionTextBuffer is not strictly better than a TextBuffer. A TextBuffer provides trivial integration with a lot of other software today, since it would operate directly on plain files. This means a user could also edit the file directly via the command line, via /debug/vscode, the jupyter notebook interface, or as a part of some other service interaction.

On the other hand some interactions might be better off keeping a continuous version log. It might be possible to create a sort of hybrid approach where we use a git repo to back a file and automatically make a commit on every call to SetText.
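One way to picture the hybrid is a thin wrapper that writes through to the plain file and then records a commit. The sketch below is a guess at that shape, not a settled design: `CommitOnWrite`, `SetTextFunc`, and `GitCommitter` are hypothetical names, the commit message format is arbitrary, and `GitCommitter` assumes a non-bare repository with a working tree and a `git` binary on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

type WritableTextBuffer interface {
	SetText(t string) error
}

// SetTextFunc adapts a plain function to WritableTextBuffer (like http.HandlerFunc).
type SetTextFunc func(t string) error

func (f SetTextFunc) SetText(t string) error { return f(t) }

// CommitOnWrite (hypothetical) writes through to Inner, then records a commit.
type CommitOnWrite struct {
	Inner  WritableTextBuffer
	Commit func(message string) error
}

// SetText only commits if the underlying write succeeded.
func (c CommitOnWrite) SetText(t string) error {
	if err := c.Inner.SetText(t); err != nil {
		return err
	}
	return c.Commit(fmt.Sprintf("SetText: %d bytes", len(t)))
}

// GitCommitter returns a Commit hook that stages and commits one file in dir.
// Assumption: dir is a non-bare git repo. git exits non-zero when there is
// nothing to commit, so writing identical text surfaces as an error here.
func GitCommitter(dir, file string) func(message string) error {
	return func(message string) error {
		steps := [][]string{
			{"-C", dir, "add", "--", file},
			{"-C", dir, "commit", "--quiet", "-m", message, "--", file},
		}
		for _, args := range steps {
			out, err := exec.Command("git", args...).CombinedOutput()
			if err != nil {
				return fmt.Errorf("git %v: %w: %s", args, err, out)
			}
		}
		return nil
	}
}
```

Edits made outside SetText (command line, vscode, jupyter) would still land in the file but would not be committed until the next SetText call, so the version log is only continuous for writes that go through the wrapper.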

ajbouh commented 5 months ago

We have prototypes of all the pieces but I now need to decide how to connect the transcript, an assistant, and the document that it generates / edits.

There are some basic steps that have to happen:

  1. At some level we have to make a choice about when an assistant should generate outputs that are directed to a file.
  2. We need to allow the assistant to understand its basic instructions for creating the file.

If the assistant can edit the file, we also need:

  1. To provide ongoing instructions for updates to the file.
  2. To provide a recent snapshot of the transcript, if there is one that it should be working from.

The thing that makes this hard is integrating all of:

A good next step would be just a file editing experience that doesn't use a transcript.