Open jlewi opened 1 month ago
Here's my Runme doc looking at some sample data. It looks like in the case of a non-interactive cell the stdout is available in one of the cell output items. For interactive cells it doesn't look like the cell output is available in either of the two output items.
It looks like the mime types of the output items are
"type": "stateful.runme/output-items",
"output": {
"content": "VGhpcyBjZWxsIGlzIGV4ZWN1dGVkIGluIG5vbiBpbnRlcmFjdGl2ZSBtb2RlIFdlZCBPY3QgIDkgMjA6Mjg6NTUgUERUIDIwMjQNCg==",
"mime": "text/plain",
"id": "01J9T7H2DRRW89QJMQNV0SS3YM"
}
It looks like the mime types of the output items are
@sourishkrout
For non-interactive cells does it matter which output item we use to get the output from? i.e. whether we use the item with mime type application/vnd.code.notebook.stdout or the mime type stateful.runme/output-items? How would stderr end up being encoded?
For interactive cells how difficult would it be to fetch stdout and stderr and include that in the requests that get sent to Foyle?
I dug into the RunMe code to try to figure out what's going on.
Output types are defined here https://github.com/stateful/vscode-runme/blob/3e36b16e3c41ad0fa38f0197f1713135e5edb27b/src/constants.ts#L6
There doesn't appear to be one for stderr.
It looks like the function generateOutputUnsafe is responsible for generating the CellOutputItems when a cell is executed.
It looks like it goes down the code branch for case OutputType.terminal
regardless of whether its interactive or non-interactive terminal.
It looks like this is the line where it tries to generate a serialized representation of the terminal contents. https://github.com/stateful/vscode-runme/blob/3e36b16e3c41ad0fa38f0197f1713135e5edb27b/src/extension/cell.ts#L239
It looks like in the case of non-interactive terminal terminalState is an instance of LocalBufferTermState and serialize returns the actual stdout.
For interactive terminal it looks like terminalState is an instance of XtermState and Serialize returns an empty string. It looks like this is using xterm-addon-serialize to get the serialized data.
Interestingly, if I save the session outputs, the outputs of interactive cells is saved in the markdown file. Does that use a different code path?
It looks like in all code paths for terminal it uses "NotebookCellOutputItem.stdout" https://github.com/stateful/vscode-runme/blob/3e36b16e3c41ad0fa38f0197f1713135e5edb27b/src/extension/cell.ts#L292 to create an output item containing stdout https://github.com/microsoft/vscode/blob/f5946cee2e01f1702426efd954ecc7250d8e5075/src/vs/workbench/api/common/extHostTypes.ts#L3818
So it looks like we can use the output item with mime type application/vnd.code.notebook.stdout to get the stdout.
On the Foyle side; we convert from Cell protos to Block protos here https://github.com/jlewi/foyle/blob/5eafef9471846d647d6544b7d79a752223ba9c51/app/pkg/runme/converters/cells_to_blocks.go#L102
This should preserve the mime type and data contents. It looks like if there are multiple output items all of them should be included when the block is converted to markdown. https://github.com/jlewi/foyle/blob/5eafef9471846d647d6544b7d79a752223ba9c51/app/pkg/docs/converters.go#L34
So my suspicion is that the prompt does get included in non-interactive cells but not interactive cells but most cells are interactive as that appears to be the default.
So I tested it with a non-interactive cell. The command in the cell was
echo "The current time is $(date)."
If you execute it and then generate a completion the markdown that gets sent to the LLM is
echo "The current time is $(date)."
{
"type": "stateful.runme/output-items",
"output": {
"content": "VGhlIGN1cnJlbnQgdGltZSBpcyBUaHUgT2N0IDEwIDE2OjQxOjQ3IFBEVCAyMDI0Lg0K",
"mime": "text/plain",
"id": "01J9WCSV2CAQVERX6JT7T8DZ5G"
}
}
The current time is Thu Oct 10 16:41:47 PDT 2024.
So the output is included. Although it looks like we might want to filter out the mime type stateful.runme/output-items.
I suspect there is a bug in how I'm serializing the notebook in the vscode-runme frontend before sending it over the wire to Foyle. The code does this by calling marshalNotebook
https://github.com/stateful/vscode-runme/blob/3e36b16e3c41ad0fa38f0197f1713135e5edb27b/src/extension/ai/generate.ts#L46 https://github.com/stateful/vscode-runme/blob/3e36b16e3c41ad0fa38f0197f1713135e5edb27b/src/extension/ai/ghost.ts#L111
@sourishkrout Am I missing a call to addExecInfo before serializing the notebook? It seems like that function might be synchronizing information from the terminal to the notebookData structure?
This likely relates to how the notebook state is snapshotted in the AI parts of the extension. I'll look into the details next week. We've recently done some work to "centralize" this so consumers other than the actual notebook serialization can use it. However, I have to make sure it supports the use case.
For non-interactive cells, this is also associated with https://github.com/jlewi/foyle/issues/143.
Questions
@sourishkrout
For non-interactive cells does it matter which output item we use to get the output from? i.e. whether we use the item with mime type application/vnd.code.notebook.stdout or the mime type stateful.runme/output-items? How would stderr end up being encoded?
For interactive cells how difficult would it be to fetch stdout and stderr and include that in the requests that get sent to Foyle?
So for non-interactive cells, please use application/vnd.code.notebook.stdout
which is the VS Code API default for stdout/stderr. The reason we have stateful.runme/output-items
and some form of duplication is that VS Code's notebook APIs tie web-components tasked with rendering the UI to a media-type and to have the custom UI/UX such as action buttons (Copy, Save, etc) available we are using stateful.runme/output-items
. While stateful.runme/output-items
is subject to UI-driven changes, application/vnd.code.notebook.stdout
will stay stable and only have the output.
Re, stderr
in non-interactive, I've not been happy with how we're handling stdout vs stderr. I have a refactor on my list that will only send stdout
into the notebook (primarily because third-party rich renderers are stdio-unaware) and, unless otherwise configured on the cell or doc, will omit stderr
in the notebook-integrated terminal. However, with the same refactor, we'll introduce a multiplexed panel-based terminal that will receive both stdout and stderr. The idea is to enable the UI/UX to highlight the panel-based terminal for non-zero exits, either auto-focusing the terminal or providing a button to call the user's attention to the error.
I am continuing with interactive exec in a separate comment/analysis.
Interactive Cells
It looks like the mime types of the output items are
application/vnd.code.notebook.stdout
- But there is no actual data
stateful.runme/terminal
- Data is a base64 encoded json dictionary that doesn't have stdout or stderr
Similarly for interactive execution, stateful.runme/terminal
renders a web-component with an instance of xterm.js
that's configured according to the payload. Studio is bidirectionally streamed directly between kernel <> terminal web-component because the notebook APIs are not capable of terminal emulation, and replacing outputs too frequently causes side-effects that make the notebook unusable (flickering, excessive resource consumption, etc).
So the question here's: why is application/vnd.code.notebook.stdout
empty?
Shouldn't this go through serialization before logging it, @jlewi? Please see GrpcSerializer.marshalNotebook
calls in both generate.ts
or ghost.ts
.
There might also be a secondary issue that Auto-save (on)
has to be active because otherwise, the terminal output will not be processed. The UI/UX state is awkwardly coupled here, which wasn't/isn't a problem for what outputs have been used for non-AI. We can fix that separately.
@sourishkrout Am I missing a call to addExecInfo before serializing the notebook? It seems like that function might be synchronizing information from the terminal to the notebookData structure?
Update: Yes, you're right. I also totally overlooked that marshalNotebook
is called and not serializeNotebook,
which includes the logic to fill the terminal buffer. We could attempt moving this business logic up the call chain.
For a different use case, I see the team's used a cache, which we could generalize, but I worry about synchronization issues. The cache might be behind by a version since it's built for saving, not editing.
As a work around, I modified Foyle to return all code cells as non-interactive cells. This works in terms of the output being included in the subsequent prompts. However, non-interactive cells don't show stderr which make them unusable in many instances.
Here's a notebook visualizing the LLM requests for a lot of examples. https://gist.github.com/jlewi/b9247059ebb3cb323eee10504ffc9f6c
It doesn't look like the cell output of previous cells is actually part of the LLM request. Rather, what we have is some RunMe metadata.