earthstar-project / willow-rs

Implementations of the Willow family of specifications in Rust.
https://willowprotocol.org
Apache License 2.0
7 stars 0 forks source link

Store trait #21

Open sgwilym opened 2 weeks ago

sgwilym commented 2 weeks ago

Work in progress, currently being discussed and iterated upon.

AljoschaMeyer commented 1 week ago

Design Notes on a Willow Store Trait: Mutation

For our rust implementation of Willow, we are designing a trait to abstract over data stores. Among the features that a store must provide are ingesting new entries, querying areas, resolving payloads, and subscribing to change notifications. Turns out this trait becomes rather involved. In this writeup, I want to focus on a small subset of the trait: everything that allows user code to mutate a data store.

On the surface, Willow stores support only a single mutating operation: ingesting new entries. Does the following trait (heavily simplified on the type-level) do the trick?

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
}

Nope, there is actually a whole lot more to consider. We'll start with something simple: while the data model only considers adding new entries to a store, there is a second operation that all implementations should support: locally deleting entries. We want to support this both for explicit entries, and for whole areas:

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    // We also need ingestion and forgetting of payloads,
    // but we'll leave that for later.
}

Now we have a sketch of a somewhat useable API, but it does not admit particularly efficient implementations. We want stores to (potentially) be backed by persistent storage. But persisting data to disk is expensive. Typically, writes are performed to an in-memory buffer, and then periodically (or explicitly on demand) flushed to disk (compare the fsync syscall in posix). To support this pattern, we should change the semantics of our methods to "kindly asking the store to eventually do something", and add a flush method to force an expensive commitment of all prior mutations (typically by fsyncing a write-ahead log).

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    async flush(&mut self) -> Result<(), StoreError>;
}

Another typical optimisation is that of efficiently operating on slices of bulk data instead of individual pieces of data. Forgetting should hopefully be rare enough, but we should definitely have a method for bulk ingestion.

trait Store {
    async fn ingest(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn ingest_bulk(&mut self, entries: &Entry[]) -> Result<(), StoreError>;
    async fn forget(&mut self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&mut self, area: Area3d) -> Result<(), StoreError>;
    async flush(&mut self) -> Result<(), StoreError>;
}

The next issue is one of interior versus exterior mutability. Those async methods desugar to returning Futures. And since the methods take a mutable reference to the store as an argument, no other store methods can be called while such a Future is in scope. Hence, this trait forces sequential issuing of commands, with no concurrency. While it might be borderline acceptable to force linearization of all mutating store accesses (especially since most method calls would simply place an instruction in a buffer rather than actually performing and committing side-effects), the inability to mutate the store while also querying it (say, by subscribing to changes) is a dealbreaker. Thus, we should change those methods to take an immutable &self reference, forcing store implementatons to employ interior mutability. To the experienced rust programmer, this shouldn't be too surprising: the whole point of a store is to provide mutable, aliased state, after all.

trait Store {
    async fn ingest(& self, entry: Entry) -> Result<(), StoreError>;
    async fn ingest_bulk(&self, entries: &Entry[]) -> Result<(), StoreError>;
    async fn forget(&self, entry: Entry) -> Result<(), StoreError>;
    async fn forget_area(&self, area: Area3d) -> Result<(), StoreError>;
    async flush(&self) -> Result<(), StoreError>;
}

Finally, there is a line of thought that I'm less confident about yet. Starting with simple ingest operations and then adding support for buffering (and flushing of the buffer) and bulk operations duplicates a lot of the design of the ufotofu BulkConsumer. So perhaps it would make sense to simply expose a BulkConsumer, whose items are an enum for the different operations (ingest, forget, forget_area). Replacing method calls with explicitly moving around enum values might sound inefficient, but there's an argument to be made that any efficient buffering operation would implement the method calls by storing a reification of the operation inside some buffer anyways. The main upside would be that of using a principled abstraction instead of providing a zoo of methods that end up implementing the same semantics anyways.

Aside from the drawback of forcing the explicit reification of operations, another drawback of the consumer approach would be the question of how to obtain the consumer. If the store itself implemented BulkConsumer, then the issue of exerior mutability would make it unusuable. If the store had a function that took a &self and returned an owned handle that implements BulkConsumer, then arbitrarily many handles could be created, i.e., the consumer would be forced to support many writers. I don't really see a way around that. So the two options seem to either be a multi-writer consumer, or a collection of (interiorly mutable) methods on the store trait without an explicit consumer.

This is the main issue I wanted to convey in this writeup. I have glossed over various details (generics and assocated types, precise errors, information to return on successful operations, an ingest-unless-it-would-overwrite method, payload handling, etc.), because those are comparatively superficial. But I'd love to hear some feedback on the issues of interior vs exterior mutability and explicit vs implicit (buffered, bulk) consumer semantics.

AljoschaMeyer commented 1 week ago

Storage requirements