Alternative Backends (Minivm, Wasm, LLVM)

slightknack commented 2 years ago

I may write some ideas down for codegen for multiple backends later. Requires some sort of typed (or type-erased) low level IR format. I plan to move away from the current Lambda + Closure format once we introduce the Qualm distributed runtime (formerly FLEx).

dumblob commented 2 years ago

Just curious - do you have any description of the plans regarding Qualm/FLEx distributed runtime? Any repository? Is it something akin to Unison?

slightknack commented 2 years ago

I have an old repository on my main account that implements most of the heap/vm https://github.com/slightknack/flex. (I demo'd the repository a while back.) Locally I've developed it a bit further, but don't intend to publish a new repo as qualm under this org until big-refactor is merged.

You could probably describe Qualm as a mixture of Unison, Lunatic, and Beam. I'll try to sketch out the core idea here for those interested, but note that Qualm is currently on the backburner until I get big-refactor merged and have time to focus on updating the runtime.

The high-level idea is simple: Write programs that scale to any level of compute by running the self-contained qualm runtime on a set of machines. You can push out work by sending content-addressed compiled functions to a node, and it will be run in parallel and distributed across machines in the cluster as needed.

Qualm is built on a minimal assembly-like format (similar to Wasm or MiniVM bytecode) that operates at the granularity of functions. These functions are content-addressed and easily distributed between nodes, as each function can be serialized to a canonical compact binary format.
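To make the content-addressing idea concrete, here is a minimal sketch in Rust. Everything in it is an assumption made for illustration: the `Op` instruction set, the tag-byte encoding, and the use of 64-bit FNV-1a as the hash are all hypothetical, and a real system would want a cryptographic hash. The only point is that a function's address is derived from its canonical bytes, so identical functions get identical addresses on every node.

```rust
/// Hypothetical instruction set; the actual Qualm assembly is not specified here.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Push(i64),
    Add,
    Call(u64), // calls another function by its content address
    Ret,
}

/// A function is just a sequence of instructions in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Function {
    ops: Vec<Op>,
}

impl Function {
    /// Canonical encoding (assumed): one tag byte per instruction,
    /// followed by any operand as 8 little-endian bytes.
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for op in &self.ops {
            match op {
                Op::Push(n) => {
                    out.push(0);
                    out.extend_from_slice(&n.to_le_bytes());
                }
                Op::Add => out.push(1),
                Op::Call(addr) => {
                    out.push(2);
                    out.extend_from_slice(&addr.to_le_bytes());
                }
                Op::Ret => out.push(3),
            }
        }
        out
    }

    /// The content address is the hash of the canonical encoding.
    fn address(&self) -> u64 {
        fnv1a(&self.to_bytes())
    }
}

/// 64-bit FNV-1a. A stand-in only: it is not collision-resistant,
/// so it would not be safe against untrusted peers.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}
```

Because `Call` refers to its target by address, a function's own address transitively pins down everything it calls, which is what makes shipping a single address to another node sufficient.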

Each node can run up to thousands/millions of fibers, each of which has an independent stack, heap, and set of capabilities. Because fibers are self-contained, they can be snapshotted, serialized, and sent between machines.
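As a rough illustration of why self-contained fibers are easy to move, the sketch below snapshots a fiber to bytes and restores it. The `Fiber` fields, the `snapshot`/`restore` names, and the length-prefixed layout are assumptions for this example, not the actual Qualm wire format.

```rust
/// Hypothetical fiber state: everything it needs lives in this one value.
#[derive(Debug, Clone, PartialEq)]
struct Fiber {
    pc: u32,
    stack: Vec<i64>,
    heap: Vec<u8>,
    capabilities: Vec<u32>,
}

/// Small cursor over a byte slice; every read is bounds-checked.
struct Reader<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let slice = self.bytes.get(self.pos..end)?;
        self.pos = end;
        Some(slice)
    }

    fn u32(&mut self) -> Option<u32> {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(self.take(4)?);
        Some(u32::from_le_bytes(buf))
    }

    fn i64(&mut self) -> Option<i64> {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(self.take(8)?);
        Some(i64::from_le_bytes(buf))
    }
}

impl Fiber {
    /// Serialize as: pc, then stack, heap, and capabilities,
    /// each prefixed with a u32 element count (all little-endian).
    fn snapshot(&self) -> Vec<u8> {
        let mut out = Vec::new();
        out.extend_from_slice(&self.pc.to_le_bytes());
        out.extend_from_slice(&(self.stack.len() as u32).to_le_bytes());
        for v in &self.stack {
            out.extend_from_slice(&v.to_le_bytes());
        }
        out.extend_from_slice(&(self.heap.len() as u32).to_le_bytes());
        out.extend_from_slice(&self.heap);
        out.extend_from_slice(&(self.capabilities.len() as u32).to_le_bytes());
        for c in &self.capabilities {
            out.extend_from_slice(&c.to_le_bytes());
        }
        out
    }

    /// Rebuild a fiber from a snapshot; returns None on truncated
    /// input or trailing bytes.
    fn restore(bytes: &[u8]) -> Option<Fiber> {
        let mut r = Reader { bytes, pos: 0 };
        let pc = r.u32()?;
        let stack_len = r.u32()? as usize;
        let mut stack = Vec::new();
        for _ in 0..stack_len {
            stack.push(r.i64()?);
        }
        let heap_len = r.u32()? as usize;
        let heap = r.take(heap_len)?.to_vec();
        let cap_len = r.u32()? as usize;
        let mut capabilities = Vec::new();
        for _ in 0..cap_len {
            capabilities.push(r.u32()?);
        }
        if r.pos != bytes.len() {
            return None;
        }
        Some(Fiber { pc, stack, heap, capabilities })
    }
}
```

A node could send the output of `snapshot` over the network and the receiver could call `restore` to resume the fiber, since nothing in the fiber points outside itself except content addresses.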

Nodes keep content-addressed functions in a shared store which individual fibers running on that node can reference. If a node does not have a function a fiber needs, the node can grab the function from its peers (as functions are content-addressed).
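The lookup-then-fetch behaviour could look something like the following sketch. The `Store` and `Peer` names, the in-memory `HashMap`, and the FNV-1a hash are all assumptions for illustration (a real node would use a cryptographic hash and a network transport). The useful property shown is that a node never has to trust a peer: it re-hashes whatever it receives and only keeps bytes that match the requested address.

```rust
use std::collections::HashMap;

type Address = u64;

/// 64-bit FNV-1a, used here only as a stand-in for a real content hash.
fn hash(bytes: &[u8]) -> Address {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Anything that can be asked for a function by address (hypothetical trait).
trait Peer {
    fn request(&self, addr: Address) -> Option<Vec<u8>>;
}

/// A node's shared store of serialized functions, keyed by content address.
struct Store {
    local: HashMap<Address, Vec<u8>>,
}

impl Store {
    fn new() -> Self {
        Store { local: HashMap::new() }
    }

    /// Insert serialized function bytes and return their address.
    fn insert(&mut self, bytes: Vec<u8>) -> Address {
        let addr = hash(&bytes);
        self.local.insert(addr, bytes);
        addr
    }

    /// Look up locally first; on a miss, ask each peer in turn and
    /// cache the first response whose hash matches the address.
    fn get(&mut self, addr: Address, peers: &[&dyn Peer]) -> Option<&[u8]> {
        if !self.local.contains_key(&addr) {
            let fetched = peers
                .iter()
                .filter_map(|p| p.request(addr))
                .find(|bytes| hash(bytes) == addr)?;
            self.local.insert(addr, fetched);
        }
        self.local.get(&addr).map(|v| v.as_slice())
    }
}

/// Every node's store can also serve requests from other nodes.
impl Peer for Store {
    fn request(&self, addr: Address) -> Option<Vec<u8>> {
        self.local.get(&addr).cloned()
    }
}
```

After the first successful fetch the function is cached locally, so later fibers on the same node can reference it without touching the network.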

I've implemented a simple interpreter for qualm assembly, but it might be more efficient for nodes to translate it to Wasm/MiniVM, or directly compile functions to native code. The hope is that nodes can be updated incrementally and run different implementations compatible with Qualm while maintaining compatibility and keeping long-running processes alive between updates.

Nodes provide a Passerine repl for adding/removing nodes, managing running fibers, testing out functions, etc. Running aspen run will spin up a single-machine parallelized qualm runtime, but if the project is configured to be a part of a cluster, the qualm runtime will connect to the other nodes present in the cluster and run it that way.

From the Passerine side of things, Qualm will be a single OTP-like system injection interface built on top of algebraic effects. This interface will contain a number of primitives like 'start new fiber', 'snapshot fiber', 'run group of fibers in parallel', etc. It's up to the runtime implementation to determine how to actually parallelize things (and I have some cool ideas here, like how to determine whether it's more efficient to run fibers in parallel or take the network hit and distribute across multiple nodes).

There is, of course, a lot of detail I'm glossing over here, like the cryptography scheme for signing work and adding new nodes, exactly what the wire format looks like, what's special about the heap implementation (it supports vaporization!), how the repl works, what HTTP/IO/etc. looks like, etc.

You may have also noticed that Qualm is not a compiler; because of the compatibility of the qualm assembly format with the likes of Wasm and MiniVM, I'm hoping that eventually other languages will be able to run on top of qualm, with Qualm serving as a sort of language-server analogue for compute.

We'll see how things go, of course. Currently I'm in the middle of big-refactor. I've made a lot of progress which I'll hopefully update y'all about soon after I publish the couple of blog posts in the pipeline about the new FFI built around system injection and what's new in the upcoming release.

dumblob commented 2 years ago

Interesting. Thanks for the primer. Will stay tuned to see the new blog posts first before commenting on anything :wink:.

vrtbl / passerine

Alternative Backends (Minivm, Wasm, LLVM) #67