wetware / pkg

Peer-to-peer cloud environment
https://wetware.run

Stream Anchor capabilities when iterating over cluster view #40

Open lthibault opened 2 years ago

lthibault commented 2 years ago

The current host-anchor implementation introduces a fair bit of complexity that could (in time) be handled by capnp's 3PH.

@aratz-lasa In the short-term, the proposed changes have moderate but direct impact on downstream consumers, so I want to be sure we take the time to discuss this. In particular, I want to make sure we aren't going to undermine any of our existing projects (or that we have good workarounds until 3PH lands).

Problem

Consider the following code.

it := n.Ls(ctx)  // n is a client.Node
for it.Next() {
    if host := it.Anchor(); hasSomeProperty(host) {
        doFoo(host.Walk("/foo"))
    }
}

We can ignore the details of how we are selecting hosts and what we are doing with them. The essential part is that we are performing doFoo on only some of the hosts in the cluster. Since we are not performing operations on every host, we do not need to create an Anchor capability for each host. This creates an opportunity for significant optimization, because creating an Anchor capability involves two sub-operations that become costly at scale:

  1. establishing network connections between peers; and,
  2. mutating the cap table on both peers (amortized O(1) time, and O(n) space).

Current Solution and Limitations

Rather than stream Anchor capabilities back to the client, we stream records that contain routing information for each host in the global view. We then use this routing information to construct a special Anchor implementation that lazily dials its remote host when its methods are called for the first time. This is a "perfect" optimization, since it avoids both sub-operations unless the anchor capability is actually being used. It neither creates surplus network connections, nor modifies cap-tables needlessly.
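For concreteness, here is a minimal sketch of the lazy-dial pattern. The names (record, dialRecord, lazyAnchor) are hypothetical stand-ins for the streamed routing information and the helper that opens an rpc.Conn from it; the actual wetware implementation differs, but the state being tracked is the same.

package lazyanchor // illustrative sketch only; not the actual wetware code

import (
	"context"
	"sync"

	"capnproto.org/go/capnp/v3"
	"capnproto.org/go/capnp/v3/rpc"
)

// record stands in for the routing information streamed back by View
// (peer ID, listen addresses, and so on).
type record struct{}

// dialRecord stands in for the helper that establishes an rpc.Conn to the
// host described by rec.
func dialRecord(ctx context.Context, rec record) (*rpc.Conn, error) {
	panic("illustrative stub")
}

// lazyAnchor defers dialing until an anchor method is invoked for the first time.
type lazyAnchor struct {
	mu     sync.Mutex
	rec    record       // routing info for the remote host
	conn   *rpc.Conn    // nil until first use; its lifecycle must be managed by us
	client capnp.Client // remote Anchor capability, resolved on first use
}

// resolve dials the remote host on first use and caches the connection and
// bootstrap capability for all subsequent calls.
func (la *lazyAnchor) resolve(ctx context.Context) (capnp.Client, error) {
	la.mu.Lock()
	defer la.mu.Unlock()

	if la.conn == nil {
		conn, err := dialRecord(ctx, la.rec) // network-layer errors surface here
		if err != nil {
			return capnp.Client{}, err
		}
		la.conn = conn
		la.client = conn.Bootstrap(ctx) // session-layer errors surface here
	}
	return la.client, nil
}

The drawbacks listed next all live in this little wrapper: connection-state tracking, ownership of the cached rpc.Conn, and the mixing of network-, session- and application-level errors in a single call path.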

The main drawback is that this "lazy-dial" approach involves extra state management. It requires us to

  1. track the connection state of each host anchor,
  2. maintain a reference to the underlying rpc.Conn after it has been established, and manage its lifecycle; and,
  3. intermix error-handling logic for the networking, session and application layers.

Overall, this approach comes at the cost of extra state-management, as well as a mild blurring of system boundaries. It also has the unfortunate effect of placing this extra complexity in high-level code (pkg/client rather than, say, pkg/vat/ocap).

Solution

If we are willing to tolerate additional load on the cap table, I think it is possible to both

  1. simplify the host dialing logic; and,
  2. move state-management to lower-level packages.

To do so, the host servicing the View RPC call need only associate an Anchor capability with each record streamed back to the caller. As before, superfluous network connections are avoided by lazy dialing. Dialing logic is simplified by delegating the construction and lifecycle-management of remote host-anchor capabilities to the cluster.Host type.
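A hypothetical sketch of the server-side change follows, under the assumption of a caller-supplied streaming handler as in the existing batch API. Here record, handler, newHostAnchor and stream are illustrative stand-ins, not wetware's real types; the essential change is that each streamed record is now paired with an Anchor capability.

package viewsketch // illustrative sketch only

import (
	"context"

	"capnproto.org/go/capnp/v3"
)

// record stands in for the routing information already streamed back today.
type record struct{}

// handler stands in for the caller-supplied streaming callback of the
// existing batch-streaming API.
type handler interface {
	Recv(ctx context.Context, rec record, anchor capnp.Client) error
}

// newHostAnchor stands in for the Anchor capability that cluster.Host would
// construct (and whose lifecycle it would manage) for the host described by rec.
func newHostAnchor(rec record) capnp.Client {
	panic("illustrative stub")
}

// stream pushes each record of the current batch to the caller, now paired
// with an Anchor capability. Adding the capability to the outgoing message is
// what creates the cap-table entries discussed under "Caveats and Mitigation".
func stream(ctx context.Context, recs []record, h handler) error {
	for _, rec := range recs {
		if err := h.Recv(ctx, rec, newHostAnchor(rec)); err != nil {
			return err
		}
	}
	return nil
}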

Caveats and Mitigation

Cap-Table Contention

As noted above, the proposed solution adds entries to both the sender's and the receiver's cap tables for each record transmitted by a call to Host.View().Iter(). In the worst case, this exhibits O(n) growth in both overall memory usage and heap-object count. Note that the cap tables at both ends of an rpc.Conn are affected. This problem is, however, attenuated by

  1. the small size of clusters currently in production,
  2. flow-control properties of our batch-streaming API; and,
  3. the existence of passive mitigation strategies ranging from sync.Pool to the use of specialized data structures in the rpc.Conn cap table.

On the flow-control point, we can expect the size of the cap table to stabilize at some asymptotic value for large routing tables, as unused capabilities from previous batches are released. The exact value of this asymptote is likely a simple function of batch size and network RTT. An arbitrary upper bound can therefore be enforced through go-capnp's existing flow-control API.
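As a rough sketch, assuming go-capnp v3's flowcontrol package (NewFixedLimiter and Client.SetFlowLimiter; exact names may shift as the library evolves), the side issuing the streaming calls could bound the in-flight volume like so:

package viewsketch // illustrative sketch only

import (
	"capnproto.org/go/capnp/v3"
	"capnproto.org/go/capnp/v3/flowcontrol"
)

// limitStream caps the bytes of un-acknowledged streaming calls on the client
// reference used to push record batches, which in turn bounds how many
// anchor-bearing records (and therefore cap-table entries) can be outstanding
// at any moment.
func limitStream(stream capnp.Client, maxInFlightBytes int64) {
	stream.SetFlowLimiter(flowcontrol.NewFixedLimiter(maxInFlightBytes))
}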

More generally, full table scans are inherently O(n), so it's expected that applications will try to avoid this by filtering the view on the server-side, in a manner analogous to classical DB queries. To this end, our first line of defense is the enriched query API proposed in #36.

Lastly, it should be noted that rpc.Conn is undergoing heavy development, and that opportunities for improving performance (e.g. through reduced lock contention) are almost certain to emerge.

Object Proxying and Third-Party Handoff

An important side-effect of the proposed refactoring is that all calls to anchors obtained via the View capability will be proxied through its host. In practice, this means proxying through the host to which a given client.Node is connected.

This is a perfect target for Cap'n Proto's "Third-Party Handoff" (3PH), which can transparently reduce the network path to a single hop. Level-3 RPC support in go-capnproto is planned, and implementation efforts are estimated to begin in Q1 of 2023.

In the meantime, the main factor to consider is that the proposed solution implies a commitment to 3PH in the medium-term future. The acute need for 3PH will manifest as application-level stability issues due to a single point-of-failure, and to a lesser extent as high latency due to the proxying of RPC calls.

lthibault commented 2 years ago

Another consideration is security. Presently, anchor.Capability is exported directly at the vat.Network level, in its own stream handler. This means that it can be arbitrarily bootstrapped by any peer capable of dialing the vat's underlying libp2p Host, which in turn makes it trivial to escape confinement to a particular anchor subtree.

Obviously this isn't a problem until we actually implement authentication for client capabilities, but it does influence the design decisions for #16. As noted in that issue, per-capability stream handlers may improve performance by taking full advantage of non-blocking QUIC streams. On the other hand, they almost inevitably increase the size of the auth boundary, since each libp2p protocol endpoint must be guarded individually.¹ For this reason, I am increasingly opposed to this approach.
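For reference, here is a minimal sketch of what a dedicated stream handler implies, using a hypothetical protocol ID and current go-libp2p / go-capnp import paths; the real registration lives in vat.Network. The key property is that whoever can open a stream on this protocol receives the root anchor as the bootstrap capability, with no per-subtree confinement in between.

package vatsketch // illustrative sketch only

import (
	"capnproto.org/go/capnp/v3"
	"capnproto.org/go/capnp/v3/rpc"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/network"
	"github.com/libp2p/go-libp2p/core/protocol"
)

// anchorProto is a hypothetical protocol ID standing in for the one vat.Network registers.
const anchorProto = protocol.ID("/ww/anchor")

// exportAnchor registers a per-capability stream handler serving root as the
// bootstrap capability. Any peer able to dial the underlying libp2p Host and
// open a stream on anchorProto receives the *root* anchor, so confinement to
// a particular subtree cannot be enforced at this boundary.
func exportAnchor(h host.Host, root capnp.Client) {
	h.SetStreamHandler(anchorProto, func(s network.Stream) {
		conn := rpc.NewConn(rpc.NewStreamTransport(s), &rpc.Options{
			BootstrapClient: root.AddRef(),
		})
		go func() {
			<-conn.Done() // the conn releases its bootstrap reference on shutdown
		}()
	})
}

Streaming anchors from View instead keeps them behind the connection the caller already has, rather than behind an independently guarded endpoint.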

To recap, the "pros" for streaming anchor capabilities directly from View are now:

  1. simpler host-dialing logic, with state-management pushed down into lower-level packages; and,
  2. no need to export anchor.Capability in its own stream handler at the vat.Network level, which keeps the auth boundary small.

The "cons" are now:

  1. additional load on the sender's and receiver's cap tables; and,
  2. proxying of anchor calls through the serving host, implying a medium-term commitment to 3PH.


¹ I have become somewhat skeptical of my previous claim that this necessarily increases the ambient authority boundary. Each endpoint could, in principle, be guarded by some kind of capability-based access token, such that each token is derived from a single source of ambient authority. Whether or not users and administrators will actually respect this design principle is another question altogether.