n0-computer / iroh

A toolkit for building distributed applications
https://iroh.computer
Apache License 2.0

Willow store #2693

Open rklaehn opened 3 weeks ago

rklaehn commented 3 weeks ago

After a lot of discussion with Aljoscha last week, I have decided to implement the Willow store roughly as proposed in https://github.com/AljoschaMeyer/kv_3d_storage

TLDR: the upside is that 3d ranges can be queried quickly and, unlike with the radix tree, the results can be returned sorted by any of the three primary dimensions: path, time, and subspace.

The downside compared to the radix tree is that insertion requires roughly O(log n) path comparisons, and path comparisons themselves are not constant time and can be expensive. Also, paths are stored whole, unlike in a radix tree where they are prefix-compressed. Prefix deletion will need a naive implementation where you just iterate and delete over everything below a prefix.
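To make the cost of a single path comparison concrete, here is a minimal sketch. It assumes a path is a list of byte-string components ordered lexicographically; the function name `compare_paths` and the step counter are hypothetical and exist only for illustration, not taken from the willow-store code.

```rust
use std::cmp::Ordering;

/// Compare two paths component by component.
/// Returns the ordering plus the number of component comparisons performed,
/// to show that the cost grows with the length of the shared prefix.
/// (Hypothetical helper; paths are assumed to be slices of byte strings.)
fn compare_paths(a: &[Vec<u8>], b: &[Vec<u8>]) -> (Ordering, usize) {
    let mut steps = 0;
    for (ca, cb) in a.iter().zip(b.iter()) {
        steps += 1;
        // Each component comparison is itself a byte-wise comparison.
        match ca.cmp(cb) {
            Ordering::Equal => continue,
            other => return (other, steps),
        }
    }
    // All shared components are equal: the shorter path sorts first.
    (a.len().cmp(&b.len()), steps)
}
```

Two paths that share a long common prefix, such as sibling entries deep in a hierarchy, are the expensive case, because every shared component has to be compared before the first difference is found.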

My current approach is to implement the tree query ops for generic X, Y, Z (currently using u64 for testing), then implement insert and delete. Operations are tested with proptest property-based tests.
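As a rough illustration of what a sorted 3d range query over generic X, Y, Z means, here is a naive reference version: a linear scan followed by a sort. It is not the tree from kv_3d_storage, and all names (`Point`, `Range3d`, `SortBy`, `query_sorted`) are made up for this sketch. Something like it could serve as the oracle that a property-based test compares the real tree against.

```rust
use std::ops::Range;

/// A point in a generic 3d key space (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Point<X, Y, Z> {
    x: X,
    y: Y,
    z: Z,
}

/// A half-open 3d box: a point matches if every coordinate lies in its range.
struct Range3d<X, Y, Z> {
    x: Range<X>,
    y: Range<Y>,
    z: Range<Z>,
}

/// The dimension the results should be sorted by.
#[derive(Clone, Copy)]
enum SortBy {
    X,
    Y,
    Z,
}

/// Naive reference query: O(n) filter, then O(k log k) sort of the k hits.
fn query_sorted<X, Y, Z>(
    points: &[Point<X, Y, Z>],
    range: &Range3d<X, Y, Z>,
    sort_by: SortBy,
) -> Vec<Point<X, Y, Z>>
where
    X: Ord + Clone,
    Y: Ord + Clone,
    Z: Ord + Clone,
{
    let mut hits: Vec<_> = points
        .iter()
        .filter(|p| {
            range.x.contains(&p.x) && range.y.contains(&p.y) && range.z.contains(&p.z)
        })
        .cloned()
        .collect();
    match sort_by {
        SortBy::X => hits.sort_by(|a, b| a.x.cmp(&b.x)),
        SortBy::Y => hits.sort_by(|a, b| a.y.cmp(&b.y)),
        SortBy::Z => hits.sort_by(|a, b| a.z.cmp(&b.z)),
    }
    hits
}
```

With `u64` for all three dimensions this matches the testing setup described above; the point of the real data structure is to produce the same results without scanning every entry.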

I am currently working on this in a separate repository: https://github.com/n0-computer/willow-store/tree/3d

Operations

Plumbing

Docs

matheus23 commented 3 weeks ago

Prefix deletion will need a naive implementation where you just iterate and delete over everything below a prefix.

Doesn't this have to happen on every insert? I.e. prefix deletion is exactly the same as inserting a tombstone at a prefix?

rklaehn commented 3 weeks ago

Doesn't this have to happen on every insert? I.e. prefix deletion is exactly the same as inserting a tombstone at a prefix?

Not sure what you mean by the tombstone, but yes: every insert requires a query for possibly affected child nodes, iterating over them, and deleting those that have to go.

~~Two~~ Three reasons why this might not be quite as bad as it sounds:

AljoschaMeyer commented 3 weeks ago

Another thought to keep in mind about naive prefix deletion vs. more efficient radix-tree-based implementations: notifying subscribers about deletions might make this O(n) (for n deleted entries) anyway. O-notation might be less important here than actual running times, but it could still be helpful to keep this in mind from the start.
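The insert-with-pruning behaviour discussed in this thread can be sketched as follows. This is a simplified model, not the willow-store implementation: it covers a single subspace, uses a plain `BTreeMap` instead of the 3d tree, and decides "newer" by timestamp alone (ties go to the existing entry), ignoring the other tie-breakers Willow defines. `NaiveStore` and its method are hypothetical names.

```rust
use std::collections::BTreeMap;

/// A path as a list of byte-string components (assumed representation).
type Path = Vec<Vec<u8>>;

/// Simplified single-subspace store mapping path -> timestamp.
#[derive(Default)]
struct NaiveStore {
    entries: BTreeMap<Path, u64>,
}

impl NaiveStore {
    /// Insert an entry and prune older entries at or below its path.
    /// Returns the pruned entries so the caller can notify subscribers;
    /// that notification loop is O(n) in the number of pruned entries
    /// no matter how cleverly the deletion itself is implemented.
    fn insert(&mut self, path: Path, timestamp: u64) -> Vec<(Path, u64)> {
        // Reject the insert if an entry at the same path or at any ancestor
        // path is at least as new (simplification: timestamp only).
        let shadowed = (0..=path.len()).any(|i| {
            self.entries
                .get(&path[..i])
                .map_or(false, |t| *t >= timestamp)
        });
        if shadowed {
            return Vec::new();
        }
        // In lexicographic order, all paths with prefix `path` form one
        // contiguous run starting at `path`, so a range scan finds them.
        let pruned: Vec<(Path, u64)> = self
            .entries
            .range(path.clone()..)
            .take_while(|(p, _)| p.starts_with(&path))
            .filter(|(_, t)| **t < timestamp)
            .map(|(p, t)| (p.clone(), *t))
            .collect();
        for (p, _) in &pruned {
            self.entries.remove(p);
        }
        self.entries.insert(path, timestamp);
        pruned
    }
}
```

Returning the pruned entries keeps the store independent of any particular subscriber mechanism; the caller decides how, and whether, to publish the deletions.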