martinsumner / kv_index_tictactree

Provide Active-Anti-Entropy features to a virtual node in a KV store
Apache License 2.0
21 stars 7 forks source link

KV Tictac Tree

Build Status

An Active Anti-Entropy library for Key-Value stores in Erlang.

This is currently a working prototype with basic testing. The target for the library is to be fully integrated with Riak KV for Release 3.0 (Autumn 2018).

The library could in theory be used by any Erlang application wanting to use Merkle trees to compare different data stores, it is designed for Riak but not coupled to Riak. It is not though a general substitute for Merkle trees when the cryptographic strength of Merkle trees is of importance (e.g. a blockchain implementation).

Overview

Library to provide an Active-Anti-Entropy (AAE) capability in a KV store. The AAE functionality is based on that normally provided through Merkle Trees, but with two changes from standard practice:

The purpose of these changes, and other small improvements to standard Merkle tree anti-entropy, are to allow for:

Primary Actors

The primary actor in the library is the controller (aae_controller) - which provides the API to startup and shutdown a server for which will manage TicTac tree caches (aae_treecache) and a parallel Key Store (aae_keystore - which may be empty when run in native mode). The aae_controller can be updated by the actual vnode (partition) manager, and accessed by AAE Exchanges (either directly or also via the vnode manager).

The AAE exchanges (aae_exchange) are finite-state machines which are initialised with a Blue List and a Pink List to compare. In the simplest form the two lists can be a single vnode and partition identifier each - or they could be different coverage plans consisting of multiple vnodes and multiple partition identifiers by vnode. The exchanges pass through two root comparison stages (to compare the root of the trees, taking the intersection of branch mismatches from both comparisons), two branch comparison stages, and then a Key and logical identifier exchange based on the leaf segment ID differences found, and finally a repair.

The AAE exchange should work the same way if two partitions are bing compared, or two coverage queries across multiple partitions are being compared.

More detail on the design can be found here.

Some further background information can be found here.

Using the Library

Following the current tests presently provides the simplest guide to using the library. There is also a mock_kv_vnode process used in these tests, and provides a sample view of how an aae_controller could be integrated.

There are two main branches:

develop-3.1 - default: Target for the Riak 3.1 release with support for OTP 22 and OTP 24;

develop-3.0: Used in the Riak 3.0 release with support for OTP 20 and OTP 22;

develop-2.9: Used in the Riak 2.9 release with support for OTP R16 through to OTP 20.

Contributing and Testing

The acceptance criteria for updating kv_index_tictactree is that it passes rebar3 dialyzer, xref, eunit, and ct with 100% coverage.

To have rebar3 execute the full set of tests, run:

rebar3 as test do xref, dialyzer, cover --reset, eunit --cover, ct --cover, cover --verbose

For those with a Quickcheck license, property-based tests can also be run using:

rebar3 as eqc do eunit --module=aae_eqc

Riak KV

This overview details how the current (Riak KV 2.2.5) AAE implementation works.

This overview details how the target (Riak KV 3.0) AAE implementation is expected to work utilising KV Tictac Trees.