Open jlewi opened 5 months ago
Currently, RunMe serializes/deserializes notebooks inside the GoLang server. I believe there was an experiment to run serialization in the browser using WASM. However, it looks like the WASM code path is being removed (stateful/vscode-runme#349).
WASM pre-dated the gRPC (de-)serializer. While it worked great in the VS Code desktop app we ran into runaway memory issues in remote execution environments. This was likely a side-effect from cross-compiling the Runme CLI to WASM instead of breaking out the Serializer into a discrete library. In any case, WASM was abandoned because the Runme kernel's role expanded and needed to live much closer to the system (io, tty, etc).
Known Blockers
There are two main blockers I'm aware of to making RunMe compatible with vscode for web
- gRPC
- Moving serialization into the browser
While not impossible from a pure technological point of view, the VS Code platform has some firm boundaries here. Many of the APIs (e.g. terminal, code lenses, file system operations, etc) as well as the "works out of the box" product experience are provided by VS Code's Extension Host APIs. These are vastly limited for what VS Code calls "Web Extensions", which basically limits them to either read-only rendering or something purely WASM-based: https://code.visualstudio.com/api/extension-guides/web-extensions & https://github.com/microsoft/vscode-extension-samples (search for directories containing wasm).
While one might think VS Code is "just a webapp", it's really an IDE micro-services architecture, and operating outside of its architectural boundaries comes with VS Code platform incompatibility as well as delivery/packaging tradeoffs.
Here's a diagram illustrating how I would like to deploy RunMe.
One light-lift option for a serverless deployment could be VS Code's tunneling capabilities. And, code-server (by Coder) or the Theia project might have some more flexible answers which I haven't fully explored.
E.g. the GHA we've built that lets you drop into a tunneled VS Code like a "breakpoint" in a workflow: https://github.com/stateful/vscode-server-action/blob/main/src/main.ts#L45-L51
I believe MSFT's licensing restricts the tunnels to non-commercial use but should otherwise allow for open/public delivery.
Not saying we don't want "serverless" ourselves, however, just trying to offer alternatives that are available "now".
VPC perimeters and bastion nodes
Enterprise infrastructure usually lives within a secure VPC. This means to run the steps in a playbook (e.g. kubectl, gcloud, awscli commands etc...), you usually have to tunnel into a machine within the perimeter.
With the proposed architecture users could open up RunMe notebooks in their browser and then execute those commands inside a machine inside the VPC via RunMe's GoLang server. The notebooks would be stored locally.
Reproducible / Containerized Environments
A major headache today with operations is that each developer has to install and configure all the tools used in playbooks. The above architecture means this can be replaced with containerized environments provisioned with role accounts. All that needs to be installed on the client is a browser and network access.
Remote Debugging
When troubleshooting problems with VMs or containers it's often necessary to execute commands within those containers or environments. The above architecture would mean we could start the RunMe server inside the target (e.g. as a K8s ephemeral container) and then execute parts of our playbooks inside that machine.
Runme, by extension of VS Code's Remote Development capabilities, supports these scenarios in multiple ways. These capabilities are well maintained by MSFT and Docker and, whether we "believe" it or not, come with the credibility of Microsoft's & Docker's brands.
Attach VS Code to Bastion via SSH: https://docs.runme.dev/how-runme-works/runme-via-ssh#how-to-set-up-ssh-connection-in-vs-code The Runme notebook UX does not change, except that now the host system is your jumphost.
First-class Devcontainer Support: https://docs.runme.dev/guide/devcontainer I call this "opscontainer" but the idea is the same as SSH. Instead of the bastion host, you run against a locally hosted container.
While we are open to improving the engineer's experience, we're "trying" really hard to build on existing open standards and leverage all the benefits that come with them.
Btw, here's an example repo from a recent talk I've given at Rejekts: https://github.com/stateful/rejekts-eu-2024
Thanks that's useful context.
So I think the partial workaround today would be to use the vscode option in RunMe to specify the gRPC address of the RunMe server and use a remote server. I would still use vscode locally. I plan on experimenting with this soon.
This is different from VSCode over ssh, where the filesystem holding the notebooks and the server running the commands are colocated.
Please let me know how that goes. Another experiment you could give a try is to run Runme's kernel server through an SSH tunnel which is likely not so different than using a remote socket.
@sourishkrout I've been thinking more about this. In particular, I've been wondering how much work it would be to create a minimal version to begin testing demand and utility.
Is it possible to start listing the parts of RunMe that would need to be refactored in order to make RunMe work in vscode for web?
Are there other pieces of RunMe functionality that won't work in vscode for web?
Is this to support running cells interactively? It looks like IRunnerProgramSession implements the PseudoTerminal interface. It looks like GrpcRunnerProgramSession implements the PseudoTerminal interface.
Would disabling the ability to run interactively in web be an easy way to deal with that?
That said it seems like this should work in VSCodeForWeb. It looks like GrpcRunnerProgramSession is mapping the terminal interface onto GRPC requests and the actual execution of the commands happens inside the RunMe gRPC server which isn't constrained by the browser. So if we switch the transport to a protocol that works in the browser then it should work?
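To make the idea above concrete, here is a minimal sketch of a terminal session that only talks to a transport and never touches a local shell. Everything in it is an assumption for illustration: `Emitter`, `RunnerTransport`, `RemoteProgramSession` and `EchoTransport` are hypothetical names, not Runme's actual classes, and the `onDidWrite`/`handleInput`/`close` members only mirror the shape of VS Code's `Pseudoterminal` (they are defined locally so the sketch runs without the `vscode` module).

```typescript
type Listener<T> = (value: T) => void;

// Local stand-in for a VS Code style event emitter (simplified: no disposables).
class Emitter<T> {
  private listeners: Listener<T>[] = [];
  readonly event = (listener: Listener<T>): void => {
    this.listeners.push(listener);
  };
  fire(value: T): void {
    for (const listener of this.listeners) listener(value);
  }
}

// Hypothetical transport abstraction: anything that can carry input to the
// server and deliver output back (gRPC, Connect, WebSocket, ...).
interface RunnerTransport {
  send(input: string): void;
  onOutput(listener: Listener<string>): void;
  close(): void;
}

// Terminal session shaped like a Pseudoterminal. It forwards keystrokes to the
// transport and surfaces whatever the remote side sends back.
class RemoteProgramSession {
  private readonly writeEmitter = new Emitter<string>();
  readonly onDidWrite = this.writeEmitter.event;

  constructor(private readonly transport: RunnerTransport) {
    transport.onOutput((chunk) => this.writeEmitter.fire(chunk));
  }

  handleInput(data: string): void {
    this.transport.send(data);
  }

  close(): void {
    this.transport.close();
  }
}

// In-memory fake "server" that echoes input in upper case, used only to
// exercise the session without a network.
class EchoTransport implements RunnerTransport {
  closed = false;
  private listener: Listener<string> = () => {};
  send(input: string): void {
    this.listener(input.toUpperCase());
  }
  onOutput(listener: Listener<string>): void {
    this.listener = listener;
  }
  close(): void {
    this.closed = true;
  }
}

const transport = new EchoTransport();
const session = new RemoteProgramSession(transport);
const screen: string[] = [];
session.onDidWrite((chunk) => screen.push(chunk));
session.handleInput("ls\r");
session.close();
```

If the session depends only on an interface like `RunnerTransport`, swapping gRPC for a browser-friendly protocol would in principle be limited to providing another implementation of that interface. How narrow the real client abstractions are is a separate question.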
Per this issue it seems like pseudoterminal is supported in web. https://github.com/microsoft/vscode/issues/116022
Is there a good pattern for excluding code that shouldn't be included in one of the versions?
The general reference architecture is documented here: https://code.visualstudio.com/api/extension-guides/web-extensions.
It's tough to gauge and quantify how much entanglement there is between web/node APIs. Making code work, behave, and cleanly bundle for both web/node (runme's code plus dep tree) is likely what's under the tip of the iceberg. Javascript is not Javascript. I worry not just about the work required to detangle but also about not destabilizing the rock-solid parts in Runme.
If I were to tackle this, I'd likely attempt it in stages:
To be clear, though, 1. ranks low on the roadmap and 2. even lower. The reason is that the Runme users I talk to expect a complete environment and don't see/use the Notebook as a standalone user experience. In my mind, going against the grain of the "ideal user profile" with this approach is Pandora's box. It really just makes it more difficult to quickly get up and running due to a config-intense out-of-the-box experience.
I do, however, think using Runme server-less has legs. However, requiring compute/mem is inevitable whether it's as part of an IDE-based delivery model or a frontend for a remote host. The former won't need changing a single line of code, when delivered server-lessly. I believe gauging interest and proving it can be done with a packaged container image (with code-server) that users can run locally or as part of a managed "server-less" dispatcher.
I reviewed the requirements outlined initially, and the "locality" of storing notebooks locally seems important. No? I wonder if the solution here is using more of the VS Code extension API. Virtual document/file system/workspaces come to mind. Perhaps it is "easy" to build a pass-thru with the browser APIs?
https://github.com/microsoft/vscode-extension-samples/tree/main/fsprovider-sample https://github.com/microsoft/vscode-extension-samples/tree/main/fsconsumer-sample
Sorry about the extended response, but, as you can tell, I have a "frontend-backend" knee-jerk from numerous past conversations (with various devs) where I feel an entrenched understanding of architecture is driving a design/approach, not the requirements and/or the user persona. :-)
Comments on your arch/impl notes:
It is not an issue since https://vscode.dev already proves that they run "as a web app". The challenge is to find a web extension that allows running a shell (vscode.dev is entirely host-less). I believe there's an experimental WASM-compiled Python interpreter that runs fully self-contained. Another way to prove the concept is to create a tunnel and just use vscode.dev as IDE frontend.
It isn't an issue since it's running a web component per cell, and the PTY/TTY is more or less a character device abstraction agnostic from a "host system". The GRPC client abstractions are somewhat well-defined but likely not narrow enough to replace them with a Connect alternative. However, the risk here is lower because they are already loosely coupled. Again, I'm struggling to see the merit in porting the transport before having a solid understanding of how to deal with the host of issues around the out-of-the-box experience.
Thank you for the detailed response.
I'd like to separate the questions of
As for how much work it would be, it sounds like there are a lot of unknowns. In particular,
Do you have suggestions for what a time bounded way of getting more clarity on 1 would be?
If I were to tackle this, I'd likely attempt it in stages:
I think this is a great suggestion.
The challenge is to find a web extension that allows running a shell (vscode.dev is entirely host-less).
Why would you need to run a shell in the browser? My assumption is the browser is still communicating with the fully functional RunMe server which is running outside the browser.
The reason is that the Runme users I talk to expect a complete environment and don't see/use the Notebook as a standalone user experience.
That's interesting. What about developers that aren't using VSCode?
Do the users you talk to already have access to a cloud development environment? e.g. code-server, github workspaces, etc...?
So a question I keep coming back to is, if I was a platform team how would I create a paved path for reading/writing/executing runbooks?
I can think of three options
I see drawbacks to each of them.
kubectl apply -f
then you would still need to download and run runme locally
So given the above none of the options are great, IMO.
I really like your suggestion about creating a render-only Notebook UX available on the Web because if we have some minimal web-based experience then it's possible to incrementally improve it.
I hope to be submitting a PR in a few weeks. I am working on creating a serializer that will use the connect protocol, which would unblock this from being able to run in VS Code for web.
Feature Request:
Make RunMe vscode extension compatible with vscode for web.
This is different from the existing RunMe on Web experience because the current experience relies on codeserver which means
Running vscode on the web means the RunMe vscode extension would be running in the browser and accessing notebooks on the machine where the browser is running and accessing a remote RunMe GoLang server. See the diagram below.
This has come up in a couple discord threads https://discord.com/channels/1102639988832735374/1102639989700968583/1250577804618371132 https://discord.com/channels/1102639988832735374/1102639989700968583/1232903457912913991
Motivation
Here's a diagram illustrating how I would like to deploy RunMe.
Notably here
This solves a couple key pain points related to runbooks and infrastructure
VPC perimeters and bastion nodes
Enterprise infrastructure usually lives within a secure VPC. This means to run the steps in a playbook (e.g. kubectl, gcloud, awscli commands etc...), you usually have to tunnel into a machine within the perimeter.
With the proposed architecture users could open up RunMe notebooks in their browser and then execute those commands inside a machine inside the VPC via RunMe's GoLang server. The notebooks would be stored locally.
Reproducible / Containerized Environments
A major headache today with operations is that each developer has to install and configure all the tools used in playbooks. The above architecture means this can be replaced with containerized environments provisioned with role accounts. All that needs to be installed on the client is a browser and network access.
Remote Debugging
When troubleshooting problems with VMs or containers it's often necessary to execute commands within those containers or environments. The above architecture would mean we could start the RunMe server inside the target (e.g. as a K8s ephemeral container) and then execute parts of our playbooks inside that machine.
Importance of storing notebooks locally
A critical difference with today's RunMe on web is that in the proposed architecture notebooks are stored on the machine where the browser runs. This simplifies deployment of the server because the RunMe GoLang server can effectively be treated as stateless whereas a codeserver is stateful. In particular, since the RunMe notebooks are stored in the codeserver server the codeserver server can't be recycled until the notebooks have been persisted to some durable storage (e.g. git).
Known Blockers
There are two main blockers I'm aware of to making RunMe compatible with vscode for web
RunMe's vscode extension uses gRPC to communicate with the GoLang server. gRPC can't run inside the web browser. There are 3 possible options
I think the connect protocol is the most promising. The other two require running a proxy. Using the connect protocol also means you can continue to use buf's generated clients.
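As a rough illustration of why the connect protocol fits the browser: a unary Connect call with the JSON codec is an ordinary HTTP POST to `/<package.Service>/<Method>`, so no gRPC-specific proxy is required. The sketch below only builds such a request. The service name `example.parser.v1.ParserService`, the method `Deserialize`, the `source` field, and the base URL are placeholders I made up for illustration, not RunMe's actual API; in practice you would use buf's generated clients instead of hand-building requests.

```typescript
// Plain description of an HTTP request. Its fields can be handed to fetch().
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a Connect unary request using the JSON codec.
// `service` is the fully qualified protobuf service name.
function buildConnectUnaryRequest(
  baseUrl: string,
  service: string,
  method: string,
  message: unknown,
): HttpRequest {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/${service}/${method}`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Connect-Protocol-Version": "1",
    },
    body: JSON.stringify(message),
  };
}

// Hypothetical service, method and message, for illustration only.
const request = buildConnectUnaryRequest(
  "http://localhost:8080/",
  "example.parser.v1.ParserService",
  "Deserialize",
  { source: "# My runbook" },
);
```

In a browser this could be sent with `fetch(request.url, { method: request.method, headers: request.headers, body: request.body })`. The server would likely also need to allow cross-origin requests from wherever the extension is hosted. This covers unary calls only; streaming calls are framed differently.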
Currently, RunMe serializes/deserializes notebooks inside the GoLang server. I believe there was an experiment to run serialization in the browser using WASM. However, it looks like the WASM code path is being removed (stateful/vscode-runme#349).
For the above architecture to work well, you want serialization to run in the browser client side so opening/saving notebooks isn't blocked on provisioning a RunMe server.
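To give a feel for what client-side deserialization means, here is a deliberately tiny sketch that splits markdown into markup and code cells without any server round trip. It is not RunMe's serializer, which handles far more (frontmatter, cell metadata, round-tripping, etc); the `Cell` shape and the `deserialize` function are hypothetical and only show that this step can be pure in-browser code.

```typescript
interface Cell {
  kind: "markup" | "code";
  languageId: string;
  value: string;
}

// Built at runtime so this example does not contain a literal fence itself.
const FENCE = "`".repeat(3);

// Split markdown into cells: fenced blocks become code cells, everything
// else becomes markup cells. Nested or indented fences are not handled.
function deserialize(markdown: string): Cell[] {
  const cells: Cell[] = [];
  let buffer: string[] = [];
  let fenceLang: string | null = null; // non-null while inside a fenced block

  const flush = (kind: Cell["kind"], languageId: string): void => {
    const value = buffer.join("\n");
    buffer = [];
    // Keep empty code cells, but skip markup cells that are only whitespace.
    if (kind === "code" || value.trim() !== "") {
      cells.push({ kind, languageId, value });
    }
  };

  for (const line of markdown.split("\n")) {
    if (line.startsWith(FENCE)) {
      if (fenceLang === null) {
        flush("markup", "markdown"); // close the markup before the fence
        fenceLang = line.slice(FENCE.length).trim();
      } else {
        flush("code", fenceLang); // closing fence ends the code cell
        fenceLang = null;
      }
    } else {
      buffer.push(line);
    }
  }
  // Flush whatever is left, including an unterminated code block.
  flush(fenceLang === null ? "markup" : "code", fenceLang ?? "markdown");
  return cells;
}

const sample = ["# Title", "", FENCE + "sh", "echo hi", FENCE, "Done"].join("\n");
const cells = deserialize(sample);
```

With something along these lines running in the extension, opening a notebook would only need the remote server once a cell is actually executed.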