MatrixAI / Polykey

Polykey Core Library - Open Source Decentralized Secret Sharing System for Zero Trust Delegation of Authority
https://polykey.com
GNU General Public License v3.0

Gestalt Synchronisation for ACL Configuration, Notifications and Vault Automation #715

Open CMCDragonkai opened 1 week ago

CMCDragonkai commented 1 week ago

This is a high-level issue, a large epic with multiple branches.

Requirements of this design

  1. The gestalt represents the combined digital identity of all Polykey nodes and their claimed identities. This means the gestalt is a graph, and this graph is the amorphous entity that external vault sharing, internal vault automation, delay-tolerant notifications, and shared ACL configuration rely upon.
  2. Acceptance testing and simulation of decentralized gestalt discovery and indexing. This means all nodes within a gestalt share information on parts of the gestalt graph that they are performing social discovery on. This enables a gestalt of Polykey nodes to shard the workload. They simultaneously maintain equal copies of the gestalt graph, but they can choose different parts of the graph to index together. This requires a consistent hashing distribution - and decentralized optimisation/decision making.
  3. Gestalt graph pruning is necessary - in terms of garbage collecting parts of the gestalt graph that are no longer relevant, to optimise resource usage, but also in terms of expulsion of a node from a gestalt.
  4. The ability to remove a node or identity from a gestalt that is time-domain dependent. This is important as revocation of a node or identity is necessary to protect against compromised social identities (or nodes). See how the @SECGov account was compromised for the announcement of Bitcoin ETF approvals, and also various takeovers of DeFi projects, such as Rocketpool's Twitter account and the Dune.com Twitter takeover.

    ![image](https://github.com/MatrixAI/Polykey/assets/640797/7690c7ae-befe-4ebd-86b0-8cac4d18c84a) ![image](https://github.com/MatrixAI/Polykey/assets/640797/33d65a84-a3c3-4576-888c-40aa0574ba39) > https://discord.com/channels/405159462932971535/405163979141545995/1202034031563112478
  5. The ACL is configuration data indicating the permissions assigned to any given gestalt in the graph. These permissions must be shared across all nodes in the same gestalt. This means every node in the gestalt must completely synchronise the ACL dataset, so that all nodes are aware of any permission set on any other node in the gestalt. This means we are in a master-master replication situation. There are 2 ways we can do this - optimistically and pessimistically. Both should be possible - where the user decides.
  6. Unify notifications into a delay-tolerant messaging system within and without the gestalt. This means notifications should follow the async messaging design standard, while RPC calls represent synchronous messaging. The order of abstraction goes like this:
    • RPC calls
    • Notification RPC Calls
    • Async Notifications by combining notifications with task manager
    • Delay tolerant async notifications
    • Present both library SDK usage - for internal messaging, and user-level messaging polykey notifications CLI
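Requirement 2 asks for a consistent hashing distribution so that nodes in a gestalt can shard the discovery and indexing workload. As a minimal sketch, assuming rendezvous (highest-random-weight) hashing is an acceptable stand-in for that distribution, each vertex could be assigned to the node with the highest hash score. The names `fnv1a` and `assignIndexer`, the FNV-1a hash, and the string IDs are all hypothetical and not part of Polykey.

```typescript
// FNV-1a 32-bit hash. An assumption for illustration only; any stable hash would do.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Rendezvous hashing: every node scores the vertex, and the highest score wins.
// Ties are broken by the lexicographically smaller node ID so the result is deterministic.
function assignIndexer(vertexId: string, nodeIds: string[]): string {
  if (nodeIds.length === 0) throw new Error('gestalt has no nodes');
  let best = nodeIds[0];
  let bestScore = -1;
  for (const nodeId of nodeIds) {
    const score = fnv1a(`${nodeId}|${vertexId}`);
    if (score > bestScore || (score === bestScore && nodeId < best)) {
      best = nodeId;
      bestScore = score;
    }
  }
  return best;
}

// Usage: every node computes the same answer locally, without coordination.
const indexer = assignIndexer('identity:twitter:alice', ['nodeA', 'nodeB', 'nodeC']);
console.log(indexer);
```

A useful property of this scheme is that when a node leaves the gestalt, only the vertices that node was indexing get reassigned; all other assignments stay put.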

Additional context

Specification

Gestalt Union and Difference

Currently a Gestalt can be unioned. This means gestalts grow by unionising smaller gestalts. A singleton Polykey node or singleton identity forms a gestalt. It is capable of joining another gestalt through the "claiming" process (see https://github.com/MatrixAI/Polykey/issues/702). This results in a few interesting possibilities:

Dealing with conflicts - either the union has no conflicts, or conflicts must be settled via a tie-breaker (optimistically, with a log showing what was unioned).

However, it is also required that we can kick a vertex out of a gestalt graph. This is more complicated than it first appears: removing a specific vertex from a loosely connected graph may actually divide the graph. We end up with a split-brain problem, where 2 parts (or even more than 2 parts) may all now claim they are the real Spider-Man.


This is similar to a blockchain fork situation. Which gestalt do we consider the "real" gestalt? If we reframe the problem, each node that continues to be part of its gestalt will believe that its gestalt is the "real" one. Therefore a breakup that occurs at any vertex is actually a breakup of a claim link.
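To make the split concrete, here is a minimal sketch that removes one vertex and lists the connected components that remain. It assumes the gestalt can be modelled as an undirected graph of string vertex IDs joined by claim links; `componentsAfterRemoval` and the example IDs are hypothetical.

```typescript
type VertexId = string;
type ClaimLink = [VertexId, VertexId];

// Returns the connected components left after `removed` and its links are dropped.
function componentsAfterRemoval(
  vertices: VertexId[],
  links: ClaimLink[],
  removed: VertexId,
): VertexId[][] {
  const adjacency = new Map<VertexId, VertexId[]>();
  for (const v of vertices) {
    if (v !== removed) adjacency.set(v, []);
  }
  for (const [a, b] of links) {
    const na = adjacency.get(a);
    const nb = adjacency.get(b);
    if (na === undefined || nb === undefined) continue; // link touches the removed vertex
    na.push(b);
    nb.push(a);
  }
  const seen = new Set<VertexId>();
  const components: VertexId[][] = [];
  for (const start of vertices) {
    if (!adjacency.has(start) || seen.has(start)) continue;
    // Depth-first traversal collecting everything reachable from `start`.
    const component: VertexId[] = [];
    const stack: VertexId[] = [start];
    seen.add(start);
    while (stack.length > 0) {
      const current = stack.pop() as VertexId;
      component.push(current);
      for (const next of adjacency.get(current) ?? []) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    components.push(component.sort());
  }
  return components;
}

// Two nodes joined only through a shared Twitter identity.
const parts = componentsAfterRemoval(
  ['nodeA', 'nodeB', 'identity:twitter', 'identity:github'],
  [['nodeA', 'identity:twitter'], ['nodeB', 'identity:twitter'], ['nodeA', 'identity:github']],
  'identity:twitter',
);
console.log(parts.length); // more than one component means the gestalt has split
```

In this example the Twitter identity is the only bridge between the two nodes, so removing it leaves two separate gestalts, each of which would consider itself the "real" one.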

That means revocation of a node is not a matter of eliminating the node; it's a matter of eliminating a subset, or the total set, of claim links to that node. You cannot reverse the past; you can only go forward into the future. So, similar to having a certificate log, a particular node must choose to revoke a prior claim that first established the union. This might not actually remove the node from the gestalt.

Here's how it can work. Any user of any node that exists in a gestalt is supposed to have authority over all other nodes in the same gestalt. There's no distinction. Let's suppose we want to eliminate a Twitter identity vertex. That means any user of the node must elect to publish a revocation claim ("I disclaim this identity"). However, does the publishing of this (which is on the sigchain) need to be on the initial node that claimed it in the first place? And how does this affect, say, multiple nodes claiming the same identity?

Is it that individual nodes disclaim a prior claim (and how does this deal with multiple claims to the same identity)?

Disclaiming by Single Claim

Or is it that a gestalt focuses on the vertex itself and publishes something that overrides all other prior claims?

Disclaiming by Focusing on Vertex

Also, when there are multiple claims, the gestalt graph should only consider the most recent one as legitimate. This is important from a disclaiming perspective, since a subsequent disclaim would take precedence over earlier claims.

Node multiple Claims to Identity
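The "most recent entry wins" rule can be sketched as a fold over claim entries. This is only an illustration of the single-claim option: it assumes a hypothetical `ClaimEntry` shape with a per-link sequence number `seq`, and it does not answer the open question of how entries from different nodes' sigchains would be ordered against each other.

```typescript
// Hypothetical entry shape; real sigchain claims carry signatures and more metadata.
type ClaimEntry = {
  seq: number; // assumed to increase with each entry for the same node-identity link
  nodeId: string;
  identityId: string;
  kind: 'claim' | 'disclaim';
};

// Returns the node-to-identity links that are still active, considering only
// the most recent entry for each link.
function activeLinks(entries: ClaimEntry[]): string[] {
  const latest = new Map<string, ClaimEntry>();
  for (const entry of entries) {
    const key = `${entry.nodeId}->${entry.identityId}`;
    const previous = latest.get(key);
    if (previous === undefined || entry.seq > previous.seq) {
      latest.set(key, entry);
    }
  }
  const active: string[] = [];
  latest.forEach((entry, key) => {
    if (entry.kind === 'claim') active.push(key);
  });
  return active.sort();
}

const log: ClaimEntry[] = [
  { seq: 1, nodeId: 'nodeA', identityId: 'twitter:alice', kind: 'claim' },
  { seq: 2, nodeId: 'nodeA', identityId: 'twitter:alice', kind: 'claim' },
  { seq: 3, nodeId: 'nodeA', identityId: 'twitter:alice', kind: 'disclaim' },
  { seq: 1, nodeId: 'nodeB', identityId: 'twitter:alice', kind: 'claim' },
];
console.log(activeLinks(log)); // nodeA's link is disclaimed, nodeB's link remains
```

Under this per-link rule, nodeB's claim survives nodeA's disclaim, which is exactly the multiple-claims case raised above; the vertex-focused alternative would need an entry that overrides every link to the identity.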

ACL Sync

There are 2 main kinds of permissions: those that relate to the gestalt, and those that relate to vaults. A permission is assigned to the entire gestalt, and not to individual nodes or identities in the gestalt. Permissions that relate to the gestalt apply to any node in that gestalt, whereas permissions that relate to vaults are further indexed by the vault ID.

Currently this depends on the definition of 2 types representing permission-action tokens: GestaltActions and VaultActions. The meaning of each permission token is defined in its respective domain. These permission tokens are similar to OAuth scopes. These scopes are just arbitrary token strings. They are not self-describing; instead they represent attributes in an ABAC sort of system.

These attributes are defined separately, but they must come together into a single domain - for example acl/types.ts - and be exported together as a complete set. Rather than creating a map for indexed permission tokens, we could instead flatten them into a serialised system. For example:

scan, notify, pull:vault:1235, clone:vault:abcefg

This would make it closer to OAuth like scopes, and may make it easier for us to integrate smart token logic (potentially using the biscuit system).

type Permission = {
  gestalt: GestaltActions;
  vaults: Record<VaultIdString, VaultActions>;
};
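As a rough sketch of the flattening idea, the following converts a `Permission` into scope strings of the form shown above (`pull:vault:1235`) and back. The `Permission` type is repeated so the block is self-contained. The concrete shapes of `GestaltActions` and `VaultActions` (records mapping an action name to `null`) and the helper names `toScopes` and `fromScopes` are assumptions for illustration.

```typescript
// Assumed shapes: an action set is a record whose keys are the granted action names.
type GestaltActions = Record<string, null>;
type VaultActions = Record<string, null>;
type VaultIdString = string;

type Permission = {
  gestalt: GestaltActions;
  vaults: Record<VaultIdString, VaultActions>;
};

// Flatten: gestalt actions become bare scopes, vault actions become `action:vault:id`.
function toScopes(permission: Permission): string[] {
  const scopes: string[] = Object.keys(permission.gestalt);
  for (const vaultId of Object.keys(permission.vaults)) {
    for (const action of Object.keys(permission.vaults[vaultId])) {
      scopes.push(`${action}:vault:${vaultId}`);
    }
  }
  return scopes;
}

// Parse scopes back into the indexed form, rejecting anything unrecognised.
function fromScopes(scopes: string[]): Permission {
  const permission: Permission = { gestalt: {}, vaults: {} };
  for (const scope of scopes) {
    const parts = scope.split(':');
    if (parts.length === 1) {
      permission.gestalt[scope] = null;
    } else if (parts.length === 3 && parts[1] === 'vault') {
      const action = parts[0];
      const vaultId = parts[2];
      if (permission.vaults[vaultId] === undefined) permission.vaults[vaultId] = {};
      permission.vaults[vaultId][action] = null;
    } else {
      throw new Error(`unrecognised scope: ${scope}`);
    }
  }
  return permission;
}

const example: Permission = {
  gestalt: { scan: null, notify: null },
  vaults: { '1235': { pull: null }, abcefg: { clone: null } },
};
console.log(toScopes(example).join(', '));
```

This sketch assumes vault IDs and action names never contain `:`; a real encoding would need an escaping rule or a different separator.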

Given the capability design of Polykey's ecosystem, why re-introduce an ACL? It's possible that an ACL here represents the easiest way of assigning permissions to external systems - by means of the giver, rather than the receiver. That is, rather than providing a bearer token to authorise access, you instead set rules, because each individual Polykey node is a centralized system with respect to its internal resources. Bearer tokens make the most sense for decentralized resources...

It's possible that we first design a basic ACL that is synchronisable, and then build on top of that a bearer token logic system - of which the most primitive form is simply possession of valid scopes, and later a fragment of logic that gets executed as part of a larger logic system.

I'm a fan of eventual consistency in the context of decentralized systems, especially considering that Polykey can be a delay-tolerant system, where parts of the gestalt may be offline, and therefore the master cannot guarantee that all nodes have retrieved the latest ACL.

This means if a user were to request an atomic ACL change, then notifications would be used - to ensure that all nodes have confirmed before confirming the ACL change itself. The user would then move onto other tasks. Alternatively the user can just accept the asynchronous nature of the ACL change. In that case, the Polykey program will accept the ACL rule.
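The two modes could be sketched as a small tracker for a single ACL change. In pessimistic mode the change only counts as confirmed once every node in the gestalt has acknowledged it (for example via a notification); in optimistic mode it is accepted immediately. `AclChange` and its methods are hypothetical names, and transport, retries, and conflict handling are deliberately left out.

```typescript
type AclChangeMode = 'optimistic' | 'pessimistic';

class AclChange {
  private readonly acknowledged = new Set<string>();

  constructor(
    public readonly scope: string,
    public readonly mode: AclChangeMode,
    private readonly gestaltNodeIds: string[],
  ) {}

  // Record a confirmation from a node; nodes outside the gestalt are ignored.
  acknowledge(nodeId: string): void {
    if (this.gestaltNodeIds.indexOf(nodeId) !== -1) {
      this.acknowledged.add(nodeId);
    }
  }

  // Nodes that have not confirmed yet, for example because they are offline.
  outstanding(): string[] {
    return this.gestaltNodeIds.filter((id) => !this.acknowledged.has(id));
  }

  // Optimistic changes are accepted at once; pessimistic ones wait for every node.
  isConfirmed(): boolean {
    return this.mode === 'optimistic' || this.outstanding().length === 0;
  }
}

const atomic = new AclChange('pull:vault:1235', 'pessimistic', ['nodeA', 'nodeB']);
atomic.acknowledge('nodeA');
console.log(atomic.isConfirmed(), atomic.outstanding()); // still waiting on nodeB
```

With delay-tolerant nodes, a pessimistic change may stay pending for a long time, which is why the user would move on to other tasks while waiting.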

Notification Sync

The basic idea is that every Polykey node in a gestalt is just a point of presence of the total identity.

Therefore when a node of gestalt A sends a notification to gestalt B, it's really sending that notification to ALL nodes of gestalt B.
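A minimal in-memory sketch of this fan-out, assuming a hypothetical `GestaltInbox` that stands in for gestalt B: a notification is delivered to every online node straight away and queued for offline nodes until they come back, which is the delay-tolerant part. The class, its methods, and the notification shape are illustrative, not Polykey APIs.

```typescript
type GestaltNotification = { from: string; body: string };

class GestaltInbox {
  private readonly online = new Set<string>();
  private readonly delivered = new Map<string, GestaltNotification[]>();
  private readonly queued = new Map<string, GestaltNotification[]>();

  constructor(private readonly nodeIds: string[]) {
    for (const nodeId of nodeIds) {
      this.delivered.set(nodeId, []);
      this.queued.set(nodeId, []);
    }
  }

  // Fan out to ALL nodes: deliver to online nodes, queue for offline ones.
  send(notification: GestaltNotification): void {
    for (const nodeId of this.nodeIds) {
      const target = this.online.has(nodeId) ? this.delivered : this.queued;
      target.get(nodeId)!.push(notification);
    }
  }

  // When a node comes online, flush everything queued while it was away.
  setOnline(nodeId: string, isOnline: boolean): void {
    const pending = this.queued.get(nodeId);
    const inbox = this.delivered.get(nodeId);
    if (pending === undefined || inbox === undefined) return; // not a member
    if (!isOnline) {
      this.online.delete(nodeId);
      return;
    }
    this.online.add(nodeId);
    inbox.push(...pending);
    pending.length = 0;
  }

  deliveredTo(nodeId: string): GestaltNotification[] {
    return this.delivered.get(nodeId) ?? [];
  }
}

const gestaltB = new GestaltInbox(['nodeB1', 'nodeB2']);
gestaltB.setOnline('nodeB1', true);
gestaltB.send({ from: 'nodeA1', body: 'vault shared' });
console.log(gestaltB.deliveredTo('nodeB1').length, gestaltB.deliveredTo('nodeB2').length);
```

In a real deployment the queue would live on the sending side or on relaying nodes rather than in one object, but the delivery rule is the same.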

Sub-Issues & Sub-PRs created

  1. ...
  2. ...
  3. ...
linear[bot] commented 1 week ago

ENG-307 Gestalt Synchronisation for ACL Configuration, Notifications and Vault Automation