Design Notes on a Willow Store Trait: Mutation

For our Rust implementation of Willow, we are designing a trait to abstract over data stores. Among the features that a store must provide are ingesting new entries, querying areas, resolving payloads, and subscribing to change notifications. It turns out that this trait becomes rather involved. In this writeup, I want to focus on a small subset of the trait: everything that allows user code to mutate a data store.

On the surface, Willow stores support only a single mutating operation: ingesting new entries. Does the following trait (heavily simplified on the type-level) do the trick?

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
}

Nope, there is actually a whole lot more to consider. We'll start with something simple: while the data model only considers adding new entries to a store, there is a second operation that all implementations should support: locally deleting entries. We want to support this both for explicit entries, and for whole areas:

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    // We also need ingestion and forgetting of payloads,
    // but we'll leave that for later.
}

Now we have a sketch of a somewhat usable API, but it does not admit particularly efficient implementations. We want stores to (potentially) be backed by persistent storage. But persisting data to disk is expensive. Typically, writes are performed to an in-memory buffer, and then periodically (or explicitly on demand) flushed to disk (compare the fsync syscall in POSIX). To support this pattern, we should change the semantics of our methods to "kindly asking the store to eventually do something", and add a flush method to force an expensive commitment of all prior mutations (typically by fsyncing a write-ahead log).

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    async fn flush(&mut self) -> Result<(), StoreError>;
}
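
As a minimal sketch of the "ask now, commit on flush" semantics described above, the following toy store records mutations in an in-memory buffer and only applies them when `flush` is called. It is synchronous to stay runnable without an executor, and `Entry`, `Pending`, and `BufferedStore` are hypothetical stand-ins, with a `Vec` playing the role of durable storage.

```rust
/// Hypothetical, heavily simplified entry type (not the real Willow `Entry`).
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub path: String,
    pub timestamp: u64,
}

/// A reified mutation waiting in the in-memory buffer.
#[derive(Clone, Debug, PartialEq)]
pub enum Pending {
    Ingest(Entry),
    Forget(Entry),
}

/// Toy store: `durable` stands in for data that has reached disk.
#[derive(Default)]
pub struct BufferedStore {
    buffer: Vec<Pending>,
    durable: Vec<Entry>,
}

impl BufferedStore {
    /// Cheap: only records the request in memory.
    pub fn ingest(&mut self, entry: Entry) {
        self.buffer.push(Pending::Ingest(entry));
    }

    /// Cheap: only records the request in memory.
    pub fn forget(&mut self, entry: Entry) {
        self.buffer.push(Pending::Forget(entry));
    }

    /// Expensive in a real store (think fsync); here it applies the buffer in order.
    pub fn flush(&mut self) {
        for op in self.buffer.drain(..) {
            match op {
                Pending::Ingest(e) => self.durable.push(e),
                Pending::Forget(e) => self.durable.retain(|d| d != &e),
            }
        }
    }

    /// Number of mutations that have been requested but not yet committed.
    pub fn pending_len(&self) -> usize {
        self.buffer.len()
    }

    /// Number of entries that have been committed.
    pub fn durable_len(&self) -> usize {
        self.durable.len()
    }
}
```

Until `flush` runs, `pending_len` grows while `durable_len` stays unchanged, which is exactly the window in which a crash would lose the requested mutations.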

Another typical optimisation is that of efficiently operating on slices of bulk data instead of individual pieces of data. Forgetting should hopefully be rare enough, but we should definitely have a method for bulk ingestion.

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn ingest_bulk(&mut self, entries: &[Entry]) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    async fn flush(&mut self) -> Result<(), StoreError>;
}
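
One possible shape for bulk ingestion, sketched here synchronously and under the assumption that entries are cheap to clone, is a default method that falls back to per-entry ingestion, which stores with a cheaper batch path can override. The types `Entry`, `StoreError`, and `VecStore` are hypothetical simplifications, and the default method is an illustration rather than a decided part of the trait.

```rust
/// Hypothetical, heavily simplified entry type.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub path: String,
    pub timestamp: u64,
}

/// Placeholder error type for the sketch.
#[derive(Debug, PartialEq)]
pub struct StoreError;

pub trait Store {
    fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;

    /// Default: one `ingest` call per entry, stopping at the first error.
    fn ingest_bulk(&mut self, entries: &[Entry]) -> Result<(), StoreError> {
        for entry in entries {
            self.ingest(entry.clone())?;
        }
        Ok(())
    }
}

/// Toy in-memory store that has a genuinely cheaper batch path.
#[derive(Default)]
pub struct VecStore {
    pub entries: Vec<Entry>,
}

impl Store for VecStore {
    fn ingest(&mut self, entry: Entry) -> Result<(), StoreError> {
        self.entries.push(entry);
        Ok(())
    }

    /// Override: append the whole slice in one call instead of N pushes.
    fn ingest_bulk(&mut self, entries: &[Entry]) -> Result<(), StoreError> {
        self.entries.extend_from_slice(entries);
        Ok(())
    }
}
```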

The next issue is one of interior versus exterior mutability. Those async methods desugar to returning Futures. And since the methods take a mutable reference to the store as an argument, no other store methods can be called while such a Future is alive. Hence, this trait forces sequential issuing of commands, with no concurrency. While it might be borderline acceptable to force linearization of all mutating store accesses (especially since most method calls would simply place an instruction in a buffer rather than actually performing and committing side-effects), the inability to mutate the store while also querying it (say, by subscribing to changes) is a dealbreaker. Thus, we should change those methods to take an immutable &self reference, forcing store implementations to employ interior mutability. To the experienced Rust programmer, this shouldn't be too surprising: the whole point of a store is to provide mutable, aliased state, after all.

trait Store {
    async fn ingest(&self, entry: Entry) -> Result<(), StoreError>;
    async fn ingest_bulk(&self, entries: &[Entry]) -> Result<(), StoreError>;
    async fn forget(&self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&self, area: Area3d) -> Result<(), StoreError>;
    async fn flush(&self) -> Result<(), StoreError>;
}
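
The effect of taking `&self` can be illustrated with a small synchronous sketch, assuming a `Mutex` as the interior-mutability mechanism (a real store might choose something else entirely). `SharedStore` and `Watcher` are hypothetical names; `Watcher` stands in for a long-lived query such as a subscription that keeps borrowing the store.

```rust
use std::sync::Mutex;

/// Hypothetical, heavily simplified entry type.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub path: String,
    pub timestamp: u64,
}

/// Toy store whose state lives behind a `Mutex`, so mutation only needs `&self`.
#[derive(Default)]
pub struct SharedStore {
    entries: Mutex<Vec<Entry>>,
}

impl SharedStore {
    /// Mutates through a shared reference.
    pub fn ingest(&self, entry: Entry) {
        self.entries.lock().unwrap().push(entry);
    }

    /// Number of entries currently held.
    pub fn count(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}

/// Stand-in for a subscription: it holds a shared borrow of the store for its whole lifetime.
pub struct Watcher<'s> {
    store: &'s SharedStore,
}

impl<'s> Watcher<'s> {
    pub fn new(store: &'s SharedStore) -> Self {
        Watcher { store }
    }

    /// What the watcher can currently observe.
    pub fn seen(&self) -> usize {
        self.store.count()
    }
}
```

With this shape, code can create a `Watcher`, call `store.ingest(..)` while the watcher is still alive, and then read `watcher.seen()`. If `ingest` took `&mut self` instead, the borrow checker would reject that sequence, because the watcher's shared borrow would still be live at the point of the mutable call.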

Finally, there is a line of thought that I'm less confident about yet. Starting with simple ingest operations and then adding support for buffering (and flushing of the buffer) and bulk operations duplicates a lot of the design of the ufotofu BulkConsumer. So perhaps it would make sense to simply expose a BulkConsumer, whose items are an enum for the different operations (ingest, forget, forget_area). Replacing method calls with explicitly moving around enum values might sound inefficient, but there's an argument to be made that any efficient buffering operation would implement the method calls by storing a reification of the operation inside some buffer anyways. The main upside would be that of using a principled abstraction instead of providing a zoo of methods that end up implementing the same semantics anyways.
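
To make the reification idea concrete, here is a rough synchronous sketch. `OpConsumer` is a local, hypothetical stand-in and deliberately not the ufotofu `BulkConsumer` trait, whose actual signatures are not reproduced here. `Operation`, `QueueingStore`, and the prefix-based `Area3d` are likewise assumptions made for illustration.

```rust
/// Hypothetical, heavily simplified entry type.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub path: String,
    pub timestamp: u64,
}

/// Hypothetical stand-in for `Area3d`: everything under a path prefix.
#[derive(Clone, Debug, PartialEq)]
pub struct Area3d {
    pub path_prefix: String,
}

/// One enum value per mutating operation.
#[derive(Clone, Debug, PartialEq)]
pub enum Operation {
    Ingest(Entry),
    Forget(Entry),
    ForgetArea(Area3d),
}

/// Local stand-in for a buffered bulk consumer (NOT the ufotofu trait).
pub trait OpConsumer {
    fn consume(&mut self, op: Operation);

    /// Default bulk path: consume the items one by one.
    fn bulk_consume(&mut self, ops: &[Operation]) {
        for op in ops {
            self.consume(op.clone());
        }
    }

    fn flush(&mut self);
}

/// Toy store that queues reified operations and applies them on flush.
#[derive(Default)]
pub struct QueueingStore {
    queue: Vec<Operation>,
    entries: Vec<Entry>,
}

impl QueueingStore {
    pub fn entry_count(&self) -> usize {
        self.entries.len()
    }
}

impl OpConsumer for QueueingStore {
    fn consume(&mut self, op: Operation) {
        self.queue.push(op);
    }

    fn flush(&mut self) {
        for op in self.queue.drain(..) {
            match op {
                Operation::Ingest(e) => self.entries.push(e),
                Operation::Forget(e) => self.entries.retain(|x| x != &e),
                Operation::ForgetArea(a) => self
                    .entries
                    .retain(|x| !x.path.starts_with(a.path_prefix.as_str())),
            }
        }
    }
}
```

In this sketch the three methods `ingest`, `forget`, and `forget_area` collapse into a single `consume`, and buffering, bulk submission, and flushing come from the consumer abstraction rather than from separate store methods.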

Aside from the drawback of forcing the explicit reification of operations, another drawback of the consumer approach would be the question of how to obtain the consumer. If the store itself implemented BulkConsumer, then the issue of exterior mutability would make it unusable. If the store had a function that took a &self and returned an owned handle that implements BulkConsumer, then arbitrarily many handles could be created, i.e., the consumer would be forced to support many writers. I don't really see a way around that. So the two options seem to be either a multi-writer consumer, or a collection of (interiorly mutable) methods on the store trait without an explicit consumer.
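
The multi-writer consequence of handing out owned handles from `&self` can be seen in a small sketch, here assuming a standard-library `mpsc` channel as the shared queue. `ChannelStore`, `Writer`, and the string-based `Operation` are hypothetical, and the channel is just one of several ways a store could merge writes from many handles.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Reified operations, with paths as plain strings to keep the sketch short.
pub enum Operation {
    Ingest(String),
    Forget(String),
}

/// Toy store: all handles feed one channel, which `flush` drains.
pub struct ChannelStore {
    tx: Sender<Operation>,
    rx: Mutex<Receiver<Operation>>,
    paths: Mutex<Vec<String>>,
}

/// Owned handle playing the role of the consumer.
pub struct Writer {
    tx: Sender<Operation>,
}

impl Writer {
    pub fn consume(&mut self, op: Operation) {
        // Sending can only fail if the store was dropped; ignored in this sketch.
        let _ = self.tx.send(op);
    }
}

impl ChannelStore {
    pub fn new() -> Self {
        let (tx, rx) = channel();
        ChannelStore {
            tx,
            rx: Mutex::new(rx),
            paths: Mutex::new(Vec::new()),
        }
    }

    /// Takes `&self`, so nothing prevents callers from creating many writers.
    pub fn writer(&self) -> Writer {
        Writer { tx: self.tx.clone() }
    }

    /// Applies everything that any writer has queued so far.
    pub fn flush(&self) {
        let rx = self.rx.lock().unwrap();
        let mut paths = self.paths.lock().unwrap();
        for op in rx.try_iter() {
            match op {
                Operation::Ingest(p) => paths.push(p),
                Operation::Forget(p) => paths.retain(|x| x != &p),
            }
        }
    }

    pub fn count(&self) -> usize {
        self.paths.lock().unwrap().len()
    }
}
```

Because `writer` only needs `&self`, two handles can interleave operations freely, and the store has to define how those interleavings are ordered; in this sketch the channel's arrival order decides.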

This is the main issue I wanted to convey in this writeup. I have glossed over various details (generics and associated types, precise errors, information to return on successful operations, an ingest-unless-it-would-overwrite method, payload handling, etc.), because those are comparatively superficial. But I'd love to hear some feedback on the issues of interior vs exterior mutability and explicit vs implicit (buffered, bulk) consumer semantics.

Storage requirements

user-facing
- mutation
  - ingest entry
    - optionally do nothing if it would overwrite a (strict) extension of the path
  - append to payload prefix
  - forget entry by pair of path and subspace id
  - forget by area (of interest)
  - forget all but area (of interest) (within a containing area (of interest)) (?)
  - all forget functionality but for payloads instead of entries
  - all forget functionality comes with a traceless flag
    - if traceless, the storage keeps no record of ever having had the data
    - if not traceless, the storage is allowed to persist the forget query for an arbitrary amount of time, in order to accurately inform persistent subscribers about the forgetting in the future
    - non-traceless forgetting does not imply rejecting future data that matches the forget operation, such a service must live at a different level (and must be implemented by not ingesting the data again in the first place)
  - flush (force persistence of all prior mutations)
    - one-shot queries may or may not observe mutations before they were flushed
    - subscriptions are only notified of mutations that have been flushed successfully
- query
  - query entry by pair of path and subspace id
    - yields Entry, AuthorisationToken, available payload prefix (length)
    - option to filter out results with incomplete payloads
  - query entries by area (of interest)
    - all of the functionality of singleton queries, also
    - ordering results (arbitrary, PTS (optionally reversed), TSP (optionally reversed), SPT (optionally reversed)) (?)
  - subscriptions: all queries should also work as long-lived subscriptions
    - this includes subscription to payload prefix appending
    - this also includes notifications about overwriting and forgetting things
  - persistent subscriptions
    - all subscription notifications come with a u64 progress id, client code can stop consuming a subscription at any time and can later (even between program shutdown and restart) resume it at the same point by supplying its progress id
    - this is a best-effort service, the storage may reject a resumption because it is too outdated
    - a simplistic implementation can effectively opt out of providing persistent subscriptions by simply considering all ids as outdated
    - forgetting entries/payloads can cause progress ids to become outdated, unless the storage tracks the things it forgot
    - this is why all forget functionality comes with a flag whether it should be completely traceless or whether the storage is allowed to track the act of forgetting
internal (for replication)
- all of the public-facing functionality may also be used internally
- query entries by 3dRange
  - yields Entry, AuthorisationToken, available payload prefix (length)
  - option to filter out results with incomplete payloads (?)
  - optionally order according to the ReconciliationAnnounceEntries:will_sort field
  - optionally as a subscription (?)
- summarise 3dRange (count, fingerprint)
- convert area of interest to 3dRange
- partition a 3dRange into k roughly evenly sized 3dRanges
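
The interplay between progress ids and the traceless flag in the requirements above can be sketched roughly as follows. This is a toy model under loud assumptions: `EventLog`, `Event`, and `Outdated` are hypothetical names, the progress id is taken to be the index of the next event a subscriber expects, and traceless forgetting is implemented with a deliberately crude policy of discarding the entire retained history.

```rust
/// Notifications a persistent subscriber could receive.
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    Ingested(String),
    Forgot(String),
}

/// Returned when a resumption point is no longer available.
#[derive(Debug, PartialEq)]
pub struct Outdated;

/// Toy notification log with u64 progress ids.
#[derive(Default)]
pub struct EventLog {
    /// Progress id of the first event that is still retained.
    first_id: u64,
    events: Vec<Event>,
}

impl EventLog {
    pub fn ingest(&mut self, path: &str) {
        self.events.push(Event::Ingested(path.to_string()));
    }

    pub fn forget(&mut self, path: &str, traceless: bool) {
        if traceless {
            // Keep no record at all: drop the history, which invalidates older progress ids.
            self.first_id += self.events.len() as u64;
            self.events.clear();
        } else {
            // Allowed to remember the act of forgetting, so subscribers can be told later.
            self.events.push(Event::Forgot(path.to_string()));
        }
    }

    /// Events from `progress_id` onwards, or `Outdated` if that point is not retained.
    pub fn resume(&self, progress_id: u64) -> Result<&[Event], Outdated> {
        let next_id = self.first_id + self.events.len() as u64;
        if progress_id < self.first_id || progress_id > next_id {
            return Err(Outdated);
        }
        Ok(&self.events[(progress_id - self.first_id) as usize..])
    }
}
```

A store that always returns `Err(Outdated)` from `resume` would correspond to the simplistic implementation that opts out of persistent subscriptions.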

earthstar-project / willow-rs

Store trait #21
