wetware / pkg

Peer-to-peer cloud environment
https://wetware.run

Basic ref-counted Anchor implementation #108

Closed lthibault closed 1 year ago

lthibault commented 1 year ago

This PR implements the basic functionality for the Anchor capability. It addresses https://github.com/wetware/ww/issues/57.

Anchors are a key feature of Wetware's public API, so I am going to use this PR as an opportunity to get our newest contributors up-to-speed on their purpose and implementation. I invite everybody to review this PR, and ask any questions they may have about this important capability.

cc @mikelsr @ayazabbas @evan-schott

A Short Primer on Anchors

What are Anchors?

Anchors are a model for distributed shared-memory. They allow Wetware servers to export a Chan or Proc, making it available to peers over the network. At the highest possible level, the Anchor API gives us a way of fetching a Chan or Proc along a particular path, in a particular server.

Example

Consider the absolute path /f19caa21dda41c43/foo/bar. This path references the memory location along the relative path foo/bar within peer f19caa21dda41c43.

Note the following properties:

  1. The first component of an absolute path uniquely identifies a server instance.
  2. Two absolute paths with the same first component point to memory locations on the same host.

Together, these properties allow us to express colocation. A Chan<- located at /f19caa21dda41c43/mailbox lives on the same physical host as the process located at /f19caa21dda41c43/actor!
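To make the path structure concrete, here is a small sketch (illustrative only; Wetware's actual parsing may differ) showing how an absolute anchor path decomposes into a server ID plus a relative path:

```python
# Illustrative sketch: an absolute anchor path decomposes into a server ID
# (the first component) and a relative path within that server.

def split_anchor_path(path: str) -> tuple[str, str]:
    parts = [p for p in path.split("/") if p]
    return parts[0], "/".join(parts[1:])

server, rel = split_anchor_path("/f19caa21dda41c43/foo/bar")
print(server, rel)  # f19caa21dda41c43 foo/bar
```

Two paths are colocated exactly when `split_anchor_path` returns the same first element for both.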

Why do we need Anchors?

Why do we need distributed shared-memory? Aren't we supposed to use communicating-sequential processes (CSP)?

Yes, applications should use CSP (i.e. channels & processes) as much as possible because it is safer and easier to reason about in a distributed environment. However, applications still need a mechanism for sharing channels with each other! To do so, the application code (think: a WASM process) creates a channel, and then assigns it to a particular path.

In pseudocode:

# Create a buffered channel.
inbox = mk_chan(buf=16)

# Store the send-end of the channel in the anchor "/f19caa21dda41c43/mailbox".
#
# Note the root anchor (path "/").  It represents the whole cluster.  Walking to
# the path "/<server-id>" returns the anchor for the corresponding peer.
root.walk("/f19caa21dda41c43/mailbox").store(inbox.asSender())

# process incoming messages
while True:
    message = inbox.recv()
    handle(message)

A completely separate peer can now do the following (pseudocode):

# Load the stored channel from the anchor "/f19caa21dda41c43/mailbox".
mailbox = root.walk("/f19caa21dda41c43/mailbox").load()

# Send a message to the remote process.
mailbox.send("hello, world!")

Again, CSP is the preferred way of passing data between processes and peers, and anchors are used as an application-defined rendezvous point where a channel can be shared or obtained. Ideally, applications should only use anchors to propagate channels. In practice, there are cases where it is both safe and convenient to share processes as well, so we allow that too.

The Anchor Capability

Anchors are represented by the Anchor capability. The schema is located in api/anchor.capnp (note that it is currently incomplete).

As we've already seen, Anchors form a tree. The operation a.Walk("/x/y/z") returns the anchor along the relative path under a. The operation a.Walk("/") (or, equivalently, a.Walk("")) returns a. A node is created dynamically when a path is walked, and is automatically removed from the tree when:

  1. a client or process is no longer using it (i.e. all capability clients pointing to it are released),
  2. it has no children; and,
  3. it is not storing a value.

Importantly, there is no operation to walk up the tree, so there is no way for an Anchor to access its parent. This is intentional. It allows applications or subroutines to be confined to a subtree, providing a cooperative mechanism for isolating application state.

From a security standpoint, the cooperative nature of this isolation makes anchors somewhat sensitive. Untrusted code generally should not have access to the root anchor, or even to a host anchor. It should instead be confined to an application-specific sub-anchor, e.g. /f19caa21dda41c43/my-app. This is best achieved through a "nursery pattern", where a privileged process is responsible for spawning non-privileged processes, and providing them with the appropriate sub-anchor. Note that I have plans for more fine-grained access control (e.g. read-only anchors, anchors without a Walk method, etc.), but that these are beyond the scope of this overview.

Implementation

Now that we have a general understanding of the purpose and public API for anchors, let us briefly discuss implementation details. This section should be helpful for understanding the present PR.

The Anchor capability is implemented by pkg/anchor.Node. Each node has a counter that tracks references (the "refcount"), which is initially set to zero. A node removes itself from its parent's children map (a map[string]*Node) when its refcount decrements from 1 to 0.

There are exactly two operations that increment the refcount:

  1. n.Child(name) increments the refcount iff a child under the relative path name does not exist.
  2. n.Anchor() increments the refcount iff there is no capnp server running for that node.

There are exactly two operations that decrement the refcount:

  1. A node decrements its parent's refcount when removing itself from the parent's set of children.
  2. pkg/anchor.server.Shutdown() always decrements the refcount of server's underlying Node. It is automatically called by the capnp library when the final Anchor client for the server has been released.

NOTE: As an implementation detail, server holds exactly one reference to the underlying Node. The server itself is refcounted separately using capnp.WeakClient. When the server's refcount hits zero (= all clients have been released), server.Shutdown() releases the unique reference to the node. This is morally equivalent to incrementing Node's refcount for each Anchor client.
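As a rough mental model of these rules, here is a toy Python sketch (the real implementation is pkg/anchor.Node in Go, with capnp-managed references; the names `child`, `acquire`, and `release` below are simplified stand-ins, not the actual API):

```python
class Node:
    """Toy model of a refcounted anchor node (not the real pkg/anchor.Node)."""

    def __init__(self, name="", parent=None):
        self.name = name
        self.parent = parent
        self.children = {}   # stands in for Go's map[string]*Node
        self.value = None
        self.refcount = 0

    def child(self, name):
        # Increments the refcount iff the child does not already exist.
        if name not in self.children:
            self.children[name] = Node(name, parent=self)
            self.refcount += 1
        return self.children[name]

    def acquire(self):
        # Models n.Anchor(): the (single) capnp server holds one reference.
        self.refcount += 1
        return self

    def release(self):
        # Models server.Shutdown(): drop one reference, then prune if orphaned.
        self.refcount -= 1
        self._maybe_remove()

    def _maybe_remove(self):
        # A node removes itself when it is unreferenced, childless, and stores
        # no value -- then decrements its parent's refcount, which may cascade.
        if (self.refcount == 0 and not self.children
                and self.value is None and self.parent is not None):
            del self.parent.children[self.name]
            self.parent.refcount -= 1
            self.parent._maybe_remove()
```

Note that because every child holds one reference to its parent, a refcount of zero already implies "no children"; the explicit check simply mirrors the three conditions listed earlier.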

Next Steps

lthibault commented 1 year ago

@ayazabbas Thanks for the review, and for the questions 😃

CSP

Question 1: Is that a reasonable way to think about this or am I missing something more critical? Is there anything else you would suggest I read?

It's not wrong, but it feels a bit narrow, so let me see if I can reframe.

Reading between the lines, it seems like you're trying to answer the question "CSP vs what?" That is, we are picking CSP over some alternative; what is that alternative? The answer is: CSP is an alternative to shared memory & mutexes. The entire point is to enable concurrency without locking.

From there, it's pretty intuitive. To have CSP, you need two things:

  1. Some kind of concurrent process
  2. Some kind of channel that can pass data from process to process

You are already familiar with CSP: Unix processes and pipes are CSP, goroutines and chans are CSP, and threads passing data to each other over a queue are CSP.
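For instance, the "threads passing data over a queue" flavor of CSP looks like this in plain Python (standard library only; nothing Wetware-specific):

```python
import threading
import queue

# Two concurrent processes communicating over a channel (a queue):
# no shared mutable state to protect, no locks in user code.
ch = queue.Queue()
received = []

def producer():
    for i in range(3):
        ch.put(i)        # send
    ch.put(None)         # sentinel: signal completion

def consumer():
    while (item := ch.get()) is not None:   # receive until sentinel
        received.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(received)  # [0, 1, 2]
```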

I'm hoping that with this starting point, the rest kinda falls into place, but please follow up if not 😃

My immediate thought was that CSP seems very similar to PubSub [...]

PubSub is arguably related to channels insofar as it's IPC that doesn't involve locking. If you squint at it just right, you could rightly call a pubsub topic a "multicast channel". I think this is a fine way to interpret it. In fact, it gets at the heart of why we have both PubSub and channels: they do roughly the same job, but make different trade-offs:

  1. channels are unicast; pubsub is multicast
  2. channels are FIFO-ordered; pubsub is unordered
  3. channels are reliable (with some caveats); pubsub is unreliable

The decision to use one vs the other pretty much boils down to the messaging patterns required by the application.

you can't put to the channel if there is no one to take

This is true of synchronous (unbuffered) channels. Just like Go, Wetware allows you to construct buffered channels that can hold up to n items. A send to a buffered channel only blocks if the buffer is full, else it enqueues the value and moves on.
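Python's bounded queue makes the same distinction: like a Go buffered channel, a send only blocks once the buffer is full (a sketch of the semantics, not Wetware's channel API):

```python
import queue

buf = queue.Queue(maxsize=2)   # analogous to mk_chan(buf=2)
buf.put("a")                   # returns immediately: buffer has room
buf.put("b")                   # still room
try:
    buf.put("c", block=False)  # buffer full: a blocking put would wait here
except queue.Full:
    print("buffer full; a send would block")
```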

So channels are somewhat ephemeral

I don't think I would agree with this characterization. A channel will usually last as long as the process(es) that are using it. Here I think it's helpful to think of goroutines. You can have a pair of long-running goroutines that communicate on one or more channels. These channels will stick around until both goroutines exit. The same principle applies here.

Here's a simple example: imagine a worker that services some kind of request. The worker has two parts: a process that does some work, and a send-only channel that others can use to enqueue work. The channel can be located e.g. at /09a1f3455e238d73/worker-42. Now, any other process in the cluster can grab that channel and enqueue work. Clearly, this channel should be bound to the lifetime of the worker process.
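That worker pattern can be sketched with a thread and a queue (in Wetware, the send end of `work` would instead be stored at an anchor such as /09a1f3455e238d73/worker-42 for other peers to load; `start_worker` and the doubling "work" are hypothetical):

```python
import threading
import queue

def start_worker():
    work = queue.Queue()      # send end: others enqueue jobs here
    results = queue.Queue()

    def run():
        # The worker process: drain jobs until a None sentinel arrives.
        while (job := work.get()) is not None:
            results.put(job * 2)   # stand-in for "do some work"

    threading.Thread(target=run).start()
    return work, results

work, results = start_worker()
work.put(21)
print(results.get())  # 42
work.put(None)        # worker exits; its channel dies with it
```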

Is there anything else you would suggest I read?

I really wouldn't overthink CSP, but if you're curious about some of the CS theory, I would recommend Tony Hoare's paper. Note that we don't support the full set of classic CSP operations (in particular, we don't have an equivalent to Go's select, because "consistency").

Beyond that, I would just keep in mind a couple of properties that make CSP such an appealing choice for distributed computing:

  1. There is no memory sharing. Values are passed by copying. This means there are no locks (!)
  2. Producers and consumers are decoupled. One doesn't have to know anything about the other.

Anchors

So an anchor is like a cluster-scope filepath that can be used by an application to expose a channel, making it available for other applications (or other instances of the same application?) to write to the channel and read from the channel.

Yes on all counts.

Nit-picking a little bit, it's not technically a file path because this is totally independent of the filesystem. I think a better comparison is a URL path component. It's an abstract path; you neither know nor care about the underlying implementation. You just know that it identifies a particular resource on a particular host.

But still, yes.

Question 2: In your pseudocode example, the peer id is hardcoded, but applications shouldn't necessarily "care" about what peer the channel of interest is on right? So once an anchor is created, how would interested parties find out about its existence and know where to find it?

Good question. There is a method on each anchor called ls. It lists the names of any children.

To find the names of all the hosts in a cluster, you would do the following in pseudocode:

for host in root.ls():
    print(host)  # prints e.g. "/f19caa21dda41c43"

From there, you can do root.walk("/f19caa21dda41c43").ls() to see host /f19caa21dda41c43's existing children.

Note also: ww client ls / at the command-line is equivalent to root.ls(). (Listing sub-anchors is WIP.)

in Kubernetes you submit your application to the cluster and the cluster chooses where those containers run, is wetware fundamentally different? Is it up to the user to control what runs on each peer?

Correct. Wetware is low-level cluster middleware. In K8s and other PaaS systems, you tend to write declarative configurations and hand these off to some sort of orchestrator/scheduler service. In Wetware, you write programs that can react to changes in the distributed environment. For instance: you can react to a host joining/leaving the cluster, or to an event that has been sent to a channel or published to a topic. You can react to a process terminating by spawning a new one, etc, etc.

The trade-off is obviously that Wetware is a bit of a sharp tool, à la C. It's easy to write incorrect programs, because there are few guard-rails. In particular: Wetware is a PA/EL system. However, it hopefully makes it easier for people who know what they're doing to write correct distributed systems, because you're handed convenient primitives and cluster-level services.

On a final note, the anchor API is ~shamelessly stolen~ heavily inspired by Circuit. Perusing the docs might help you get a holistic sense of where this is all going.

lthibault commented 1 year ago

Merged, but happy to continue the discussion, whether here or on Matrix.