firedancer-io / firedancer

Firedancer is Jump Crypto's Solana consensus node implementation.
https://firedancer.io
Other
763 stars 127 forks source link

Persistent inter-process shared bplus tree code generator #2236

Closed kbowers-jump closed 4 days ago

kbowers-jump commented 4 days ago

(By popular demand from a number of different people, including most recently Liam.)

From the templated code documentation:

Generates prototypes, inlines and/or implementations for an ultra high performance bplus-tree-based key-val store. A bplus tree can be persisted beyond the lifetime of creating process, used concurrently, used IPC, relocated in memory, naively serialized/deserialized and/or moved between hosts. Virtually all operations on a bplus tree are a fast O(lg N) (where N is the number of elements stored) or better in worst case.

At its core, this is fast binary search on a sorted array. But the sorted array has been partition into leaves where each leaf is responsible for a continuous disjoint portion of the key space and union of the ranges covered by the leaves covers the entire key space. The leaves are stored in a tree whose nodes have a large and flexible number of branches per each node that specify how leaves completely partition key space. Further, to support fast forward and reverse iteration, the leaves are organized into a sorted doubly linked list. Lastly, the interior nodes and leaves are guaranteed to be full enough that query has a fast O(lg N) worst case and have enough slack that insert / upsert / remove also have fast O(lg N) worst case.

This has a number of improvements over textbook bplus trees, including: