stateful / runme

DevOps Workflows Built with Markdown
https://runme.dev
Apache License 2.0

vscode for web and serverless RunMe #616

Open jlewi opened 2 weeks ago

jlewi commented 2 weeks ago

Feature Request:

Make the RunMe vscode extension compatible with vscode for web.

This is different from the existing RunMe on Web experience because the current experience relies on codeserver which means

Running vscode on the web means the RunMe vscode extension would be running in the browser and accessing notebooks on the machine where the browser is running and accessing a remote RunMe GoLang server. See the diagram below.

This has come up in a couple discord threads https://discord.com/channels/1102639988832735374/1102639989700968583/1250577804618371132 https://discord.com/channels/1102639988832735374/1102639989700968583/1232903457912913991

Motivation

Here's a diagram illustrating how I would like to deploy RunMe.

[Diagram: shapes at 24-05-23 07 59 47]

Notably here

This solves a couple key pain points related to runbooks and infrastructure

VPC perimeters and bastion nodes

Enterprise infrastructure usually lives within a secure VPC. This means to run the steps in a playbook (e.g. kubectl, gcloud, awscli commands etc...), you usually have to tunnel into a machine within the perimeter.

With the proposed architecture users could open up RunMe notebooks in their browser and then execute those commands inside a machine inside the VPC via RunMe's GoLang server. The notebooks would be stored locally.

Reproducible / Containerized Environments

A major headache today with operations is that each developer has to install and configure all the tools used in playbooks. The above architecture means this can be replaced with containerized environments provisioned with role accounts. All that needs to be installed on the client is a browser and network access.

Remote Debugging

When troubleshooting problems with VMs or containers it's often necessary to execute commands within those containers or environments. The above architecture would mean we could start the RunMe server inside the target (e.g. as a K8s ephemeral container) and then execute parts of our playbooks inside that machine.

Importance of storing notebooks locally

A critical difference with today's RunMe on web is that in the proposed architecture notebooks are stored on the machine where the browser runs. This simplifies deployment of the server because the RunMe GoLang server can effectively be treated as stateless whereas a codeserver is stateful. In particular, since the RunMe notebooks are stored in the codeserver server the codeserver server can't be recycled until the notebooks have been persisted to some durable storage (e.g. git).

Known Blockers

There are two main blockers I'm aware of to making RunMe compatible with vscode for web

  1. gRPC
  2. Moving serialization into the browser

RunMe's vscode extension uses gRPC to communicate with the GoLang server. gRPC can't run inside the web browser. There are 3 possible options

  1. Use buf's connect protocol
  2. Use grpc-web
  3. Use grpc-gateway

I think the connect protocol is the most promising. The other two require running a proxy. Using the connect protocol also means you can continue to use buf's generated clients.
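To make the appeal of the connect protocol concrete: a unary Connect call is an ordinary HTTP POST to `/<package>.<Service>/<Method>` with a JSON (or binary protobuf) body, so a browser can issue it with `fetch` and no proxy in between. The sketch below only builds such a request by hand to show the wire shape. The server URL, the service name `example.v1.ParserService`, and the method `Deserialize` are hypothetical placeholders, not RunMe's actual API.

```typescript
// Minimal sketch of a Connect unary request, built by hand to show the wire shape.
// The service, method, and host names used below are hypothetical, not RunMe's real API.

interface ConnectUnaryRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildConnectUnaryRequest(
  baseUrl: string,
  service: string, // fully qualified protobuf service name
  method: string, // RPC method name
  message: unknown // JSON representation of the request message
): ConnectUnaryRequest {
  // Drop trailing slashes so the path is joined with exactly one "/".
  const base = baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/${service}/${method}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Connect-Protocol-Version": "1",
      },
      body: JSON.stringify(message),
    },
  };
}

const req = buildConnectUnaryRequest(
  "https://runme.example.internal/",
  "example.v1.ParserService",
  "Deserialize",
  { source: "# hello" }
);
// In a browser this could be sent with: fetch(req.url, req.init)
```

In practice the generated clients mentioned above would hide these details; the point is only that nothing here needs more than what a browser already provides.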

Currently, RunMe serializes/deserializes notebooks inside the GoLang server. I believe there was an experiment to run serialization in the browser using WASM. However, it looks like the WASM code path is being removed (stateful/vscode-runme#349).

For the above architecture to work well, you want serialization to run in the browser client side so opening/saving notebooks isn't blocked on provisioning a RunMe server.
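As a rough illustration of what client-side serialization involves, here is a minimal sketch that splits Markdown into markup and code cells and joins them back, with no server round trip. The `Cell` type and both function names are assumptions made for this example; RunMe's real serializer handles far more (cell metadata, frontmatter, nested fences), and this is not its implementation.

```typescript
// Hypothetical cell model for illustration; the real notebook format carries more metadata.
type Cell =
  | { kind: "markup"; value: string }
  | { kind: "code"; languageId: string; value: string };

// Three backticks, built with repeat() so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

function deserialize(markdown: string): Cell[] {
  const cells: Cell[] = [];
  let prose: string[] = [];
  let code: string[] | null = null; // non-null while inside a fenced block
  let languageId = "";

  // Emit the accumulated prose lines as one markup cell (skipping blank-only runs).
  const flushProse = (): void => {
    const text = prose.join("\n").trim();
    if (text) cells.push({ kind: "markup", value: text });
    prose = [];
  };

  for (const line of markdown.split("\n")) {
    if (line.startsWith(FENCE)) {
      if (code === null) {
        // Opening fence: close the current prose run and start a code cell.
        flushProse();
        languageId = line.slice(FENCE.length).trim();
        code = [];
      } else {
        // Closing fence: emit the code cell.
        cells.push({ kind: "code", languageId, value: code.join("\n") });
        code = null;
      }
    } else if (code !== null) {
      code.push(line);
    } else {
      prose.push(line);
    }
  }
  // For simplicity, an unterminated fence is kept as prose.
  if (code !== null) prose.push(FENCE + languageId, ...code);
  flushProse();
  return cells;
}

function serialize(cells: Cell[]): string {
  const blocks = cells.map((c) =>
    c.kind === "code" ? `${FENCE}${c.languageId}\n${c.value}\n${FENCE}` : c.value
  );
  return blocks.join("\n\n") + "\n";
}
```

Because both directions are pure string functions, opening and saving a notebook would only touch the local file; a server would be needed only once a cell is executed.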

sourishkrout commented 2 weeks ago

Currently, RunMe serializes/deserializes notebooks inside the GoLang server. I believe there was an experiment to run serialization in the browser using WASM. However, it looks like the WASM code path is being removed (stateful/vscode-runme#349).

WASM pre-dated the gRPC (de-)serializer. While it worked great in the VS Code desktop app we ran into runaway memory issues in remote execution environments. This was likely a side-effect from cross-compiling the Runme CLI to WASM instead of breaking out the Serializer into a discrete library. In any case, WASM was abandoned because the Runme kernel's role expanded and needed to live much closer to the system (io, tty, etc).

Known Blockers

There are two main blockers I'm aware of to making RunMe compatible with vscode for web

  1. gRPC
  2. Moving serialization into the browser

While this is not impossible from a purely technological point of view, the VS Code platform has some firm boundaries here. Many of the APIs (e.g. terminal, code lenses, file system operations, etc.), as well as the "works out of the box" product experience, are provided by VS Code's Extension Host APIs. These are severely limited for what VS Code calls "Web Extensions", which basically restricts them to either read-only rendering or something purely WASM-based: https://code.visualstudio.com/api/extension-guides/web-extensions & https://github.com/microsoft/vscode-extension-samples (search for directories containing wasm).

While one might think VS Code is "just a webapp", it's really an IDE micro-services architecture, and operating outside of its architectural boundaries comes with VS Code Platform incompatibility as well as delivery/packaging tradeoffs.

sourishkrout commented 2 weeks ago

Here's a diagram illustrating how I would like to deploy RunMe.

One light-lift option for a serverless deployment could be VS Code's tunneling capabilities. And, code-server (by Coder) or the Theia project might have some more flexible answers which I haven't fully explored.

E.g. the GHA we've built that lets you drop into a tunneled VS Code like a "breakpoint" in a workflow: https://github.com/stateful/vscode-server-action/blob/main/src/main.ts#L45-L51

I believe MSFT's licensing restricts the tunnels to non-commercial use but otherwise should allow for open/public delivery.

Not saying we don't want "serverless" ourselves, however, just trying to offer alternatives that are available "now".

sourishkrout commented 2 weeks ago

VPC perimeters and bastion nodes

Enterprise infrastructure usually lives within a secure VPC. This means to run the steps in a playbook (e.g. kubectl, gcloud, awscli commands etc...), you usually have to tunnel into a machine within the perimeter.

With the proposed architecture users could open up RunMe notebooks in their browser and then execute those commands inside a machine inside the VPC via RunMe's GoLang server. The notebooks would be stored locally.

Reproducible / Containerized Environments

A major headache today with operations is that each developer has to install and configure all the tools used in playbooks. The above architecture means this can be replaced with containerized environments provisioned with role accounts. All that needs to be installed on the client is a browser and network access.

Remote Debugging

When troubleshooting problems with VMs or containers it's often necessary to execute commands within those containers or environments. The above architecture would mean we could start the RunMe server inside the target (e.g. as a K8s ephemeral container) and then execute parts of our playbooks inside that machine.

Runme, by extension of VS Code's Remote Development capabilities, supports these scenarios in multiple ways. These capabilities are well maintained by MSFT and Docker and, whether we "believe" it or not, come with the credibility of Microsoft's and Docker's brands.

  1. Attach VS Code to Bastion via SSH: https://docs.runme.dev/how-runme-works/runme-via-ssh#how-to-set-up-ssh-connection-in-vs-code The Runme notebook UX doesn't change, except that now the host system is your jumphost.

  2. First-class Devcontainer Support: https://docs.runme.dev/guide/devcontainer I call this "opscontainer" but the idea is the same as SSH. Instead of the bastion host, you run against a locally hosted container.

While we are open to improving the engineer's experience, we're "trying" really hard to build on existing open standards and leverage all the benefits that come with them.

Btw, here's an example repo from a recent talk I've given at Rejekts: https://github.com/stateful/rejekts-eu-2024

jlewi commented 2 weeks ago

Thanks that's useful context.

So I think the partial workaround today would be to use the vscode option in RunMe to specify the gRPC address of the RunMe server and use a remote server. I would still use vscode locally. I plan on experimenting with this soon.

This is different from VSCode over ssh, where the filesystem where notebooks are stored and the server where commands are running are colocated.

sourishkrout commented 2 weeks ago

So I think the partial workaround today would be to use the vscode option in RunMe to specify the gRPC address of the RunMe server and use a remote server. I would still use vscode locally. I plan on experimenting with this soon.

This is different from VSCode over ssh, where the filesystem where notebooks are stored and the server where commands are running are colocated.

Please let me know how that goes. Another experiment you could try is running Runme's kernel server through an SSH tunnel, which is likely not so different from using a remote socket.