turbofish-org / merk

High-performance Merkle key/value store
Apache License 2.0
226 stars 36 forks source link

Relative key encodings for child references #36

Open mappum opened 4 years ago

mappum commented 4 years ago

Each node must contain the keys of its children in order to be able to traverse the tree. This adds significant overhead compared to the actual data itself, especially when keys are long.

Since a node's children's keys will often be very similar to it's own and only differ at the last few bytes, we can cheaply shave down that data by instead storing keyA XOR keyB, removing leading zeroes. When traversing, we can expand into the full key by XORing once again with the parent node's key.

We also have to handle keys which are a different length, and this can be done with a second vector of additional bytes which are not part of the XOR.

mappum commented 4 years ago

This gives us the space-saving benefits of a trie without relying on random orderings to balance the tree.

We can also use this compact encoding in generated proofs in order to make them significantly smaller.