This repository owes origins of its design to @RangerMauve's awesome multi-hyperdrive, and of course, the very awesome Hyperbee and the whole of Hypercore.
A LevelUP compatible leaderless multi-master database with eventual consistency, using hyperbee + CRDT + HLC. Similarly CockroachDB achieves replication on top of RocksDB, but here it is a pure P2P streaming database, with zero central management. LevelDB compatibility allows to use Dynalite on top to achieve DynamoDB compatibility with multiple tables, auto-updated secondary indexes, and fairly complex queries combining those indexes. Work on @mhart's Dynalite is almost completed to remove the HTTP server, to make this combination perfect as an embedded database and for serverless scenarios.
Hyperbee is one-of-a-kind steaming database that can change the way we work with databases. But like all other low-level components of hypercore ecosystem it is a single-writer data structure. Multi-writer is a higher-level abstraction, hence the name multi-hyperbee.
const MultiHyperbee = require('multi-hyperbee')
const hyperbeeOpts = { keyEncoding: 'utf-8', valueEncoding: 'json' }
const multiHyperbee = new MultiHyperbee(storage, hyperbeeOpts)
// Each app usually has its own key exchange mechanism with remote peers. So after exchange is completed,
// we will know the keys of the peer's diff feeds. To receive updates from them, you need to add them here. Repeat for all remote peers.
{
await multiHyperbee.addPeer(peerDiffFeedKey)
}
See example here.
creates a new MultiHyperbee with two single-writer hypercores:
{
keyEncoding: 'utf-8' | 'binary' | 'ascii', // or some abstract encoding
valueEncoding: <same as above>
}
customMergeHandler - CRDT handler to apply changes to the Object. If not using default, it should be implemented in a following way
class MergeHandler {
constructor(store) {
this.store = store
}
// It'll apply diff to the correct version of an object
merge(diff) {
}
// That will generate diff based on the last version of the object in the store.
genDiff(oldValue, newValue) {
return diff
}
}
storeValue
- should be a JSON object
Put will write two objects at the same time to Store and to Diff hyperbee. Put will add to each of the objects following properties:
to Store object:
to Diff object
Diff object will be put with the key key/_timestamp
Diff object can be set as a property of the storeValue or it will be generated by MultiHyperbee based on the previous version of the resource in the Store (that still needs to ). If Diff is set as a property of the value it should be added to Object as property _diff. It will be deleted from the Store object and won't be a part of the Store object
Check the diff format here
Diff object that is written to diffHyperbee will have a key that is consists of 2 parts:
Created replica Hyperbee using replica key
removes replica Hyperbee
Use it for writing your own custom merge handler. It creates a union stream from all the replica hyperbees where all the entries are the Diff objects. It runs on each replica hyperbee.
// key has timestamp in it. To limit the search we define the top limit as a key without the timestamp
let lte = key.split('/').slice(0, -1).join('/')
createReadStream({gte: key, lte })
It is used by mergeHandler for applying changes to the Store object when for example:
the rest of API is the same as Hyperbee
diff
and before writing to store
, or after reading from feed
and applying to `store'.In this version we only add multi-writer to Hyperbee. But we can extend it to Trie and Drive. Here are our thoughts on how this might work.
Previous version of the design did not have a diff
feed and thus followed multi-hyperdrive's design more closely. Multi-hyperdrive does not apply updates to the primary, which we did even in the initial version of multi-hyperbee. Instead on each read it checks on the fly which file is fresher and returns that file (It checks in primary and in all the replicas from peers). It also supports on-the-fly merging of the directory listings, without changing any structures on disk. In this design we deviated even further as we needed to support CRDT merging.
To implement CRDT for Hyperdrive files, we might need to change multi-hyperdrive's design to use the diff
feed:
diff
feed, to avoid slowing down DB replication. Hyperdrive uses 2 structures, hypertrie for directory and file metadata and hypercore for data. So changes in hypertrie can be propagated via diff
and. This is a third interation of the design, previous is described and implemented in the release tagged v0.1.
In the prior design we had a primary hyperbee and sparse replicas of other peers' primary hyperbees. In the new design the full object is not replicated to the peers, only its diff (this design eliminated the ping-pong problem of the prior design as the store is not replicated).
In this design we have 2 hyperbees into which we write, store
and diff
. Store
contains a full set of fresh objects (and their older versions, as does any hypercore). Diff
contains only the specially formatted objects (our custom CRDT format) that represent modifications to local objects. Every peer replicates other peers' diff
hyperbees (but not the store
).
Upon diff
hyperbee getting an update() event, we apply the received diff object to the store using the algo below:
For the CRDT algorithm to do its magic, we first rewind to the proper object version and then apply local diffs and a newly arrived remote diff:
This algorithm ensures that all peers have the store in exactly the same state.
Read performance: equals normal hyperbee performance Write performance: quite expensive: