heapwolf / level-replicator

WIP; Eventually consistent log-based multi-master replication for levelDB (@Level)
MIT License

SYNOPSIS

Eventually consistent, log-based, multi-master replication for LevelDB.

MULTI MASTER EXAMPLE

Example with UDP peer discovery (see below); the peers object can be left empty.

Server 1 instance 1

var level = require('level')
var replicate = require('level-replicator')

var levelConfig = {} // level configuration object

// default settings
var replicationConfig = { port: 9000, host: '127.0.0.1', peers: {} }

var db = replicate(level('/tmp/db', levelConfig), replicationConfig)

// put something into the database
db.put('some-key', 'some-value', function(err) {
})

Server 1 instance 2

var level = require('level')
var replicate = require('level-replicator')

var levelConfig = {} // level configuration object

// different port
var replicationConfig = { port: 9001, host: '127.0.0.1', peers: {} }

// different db folder
var db = replicate(level('/tmp/db2', levelConfig), replicationConfig)

db.put('some-key', 'some-value', function(err) {
})

Server 2

var level = require('level')
var replicate = require('level-replicator')

var levelConfig = {} // level configuration object

// any port
var replicationConfig = { port: 9000, host: '127.0.0.1', peers: {} }

var db = replicate(level('/tmp/db', levelConfig), replicationConfig)

db.put('some-key', 'some-value', function(err) {
})

db.on('connect', function(host, port){ console.log('connect', host, port) })
db.on('connection', function(host, port){ console.log('connection', host, port) })
db.on('error', function(err){ console.log('error', err) })

Server 3...

var level = require('level')
var replicate = require('level-replicator')

var levelConfig = {} // level configuration object

// any port
var replicationConfig = { port: 9000, host: '127.0.0.1', peers: {} }

var db = replicate(level('/tmp/db', levelConfig), replicationConfig)

db.put('some-key', 'some-value', function(err) {
})

REPLICATION ALGORITHM

REPLICATION CONFLICTS

Before a local database can accept writes, it must attempt to replicate. This reduces the likelihood of conflicts. However, in the eventual consistency model, conflicts can still occur. A conflict happens when two or more writes with the same key and the same logical clock value are made on two or more servers, for example...
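As a rough illustration of that situation, here is a minimal sketch. The entry shape ({ key, clock, value }) and the isConflict helper are assumptions made up for this example; they are not level-replicator's actual log format or API.

```javascript
// Hypothetical log entries written independently on two servers.
// The field names are assumptions for illustration only.
var onServerA = { key: 'some-key', clock: 3, value: 'written-on-a' }
var onServerB = { key: 'some-key', clock: 3, value: 'written-on-b' }

// Two entries conflict when they share both the key and the logical clock value.
function isConflict(a, b) {
  return a.key === b.key && a.clock === b.clock
}

console.log(isConflict(onServerA, onServerB)) // prints true
```

Neither entry is "newer" by its clock, which is why some extra rule is needed to pick a winner.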

Which write happened first? There is no reliable way to know. If this is a possibility for you, a resolver can be used to determine which write should be accepted. A resolver is a function that can be passed into the configuration. It should return true to accept the remote value or false to reject it.

{ resolver: function(a, b) { return a.timestamp > b.timestamp; } }
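To make that contract concrete, here is a minimal last-write-wins sketch. It assumes the first argument is the remote entry, the second is the local one, and both carry a numeric timestamp; the README does not spell out the argument order, and the resolve helper is hypothetical, only there to show how the boolean result picks a winner.

```javascript
// Assumption: `remote` is the incoming entry, `local` is the stored one,
// and both have a numeric `timestamp` field.
function lastWriteWins(remote, local) {
  return remote.timestamp > local.timestamp
}

// Hypothetical helper: true from the resolver accepts the remote entry,
// false keeps the local one.
function resolve(resolver, remote, local) {
  return resolver(remote, local) ? remote : local
}

var local = { key: 'some-key', value: 'old', timestamp: 1000 }
var remote = { key: 'some-key', value: 'new', timestamp: 2000 }

// The resolver sits next to the other replication settings.
var replicationConfig = { port: 9000, host: '127.0.0.1', resolver: lastWriteWins }

console.log(resolve(replicationConfig.resolver, remote, local).value) // prints "new"
```

With a strict `>`, equal timestamps make every server keep its own value, so a deterministic tie-breaker may be worth adding if ties are possible in your setup.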

PEER DISCOVERY

Server lists are a pain to maintain. They also don't work well in auto-scaling scenarios. level-replicator can use UDP multicast to discover the peers that it will replicate with.
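For readers unfamiliar with the technique, this is roughly what UDP multicast discovery looks like with Node's dgram module. It is a sketch only: the multicast group, the discovery port, the JSON message format, and all function names are assumptions, not what level-replicator actually sends on the wire.

```javascript
var dgram = require('dgram')

// Assumed values for illustration; the real group, port, and message
// format used by level-replicator are not documented here.
var GROUP = '239.255.42.99'
var DISCOVERY_PORT = 41234

// Serialize the address on which this node accepts replication connections.
function encodeAnnouncement(host, port) {
  return Buffer.from(JSON.stringify({ host: host, port: port }))
}

// Parse an announcement; returns null for anything malformed.
function decodeAnnouncement(buffer) {
  try {
    var peer = JSON.parse(buffer.toString())
    if (peer && typeof peer.host === 'string' && Number.isInteger(peer.port)) {
      return { host: peer.host, port: peer.port }
    }
  } catch (err) {}
  return null
}

// Join the group, report every peer heard, and announce ourselves once a second.
function startDiscovery(self, onPeer) {
  var socket = dgram.createSocket({ type: 'udp4', reuseAddr: true })

  socket.on('message', function (message) {
    var peer = decodeAnnouncement(message)
    // Ignore malformed packets and our own announcements.
    if (peer && !(peer.host === self.host && peer.port === self.port)) onPeer(peer)
  })

  socket.bind(DISCOVERY_PORT, function () {
    socket.addMembership(GROUP)
    var message = encodeAnnouncement(self.host, self.port)
    var timer = setInterval(function () {
      socket.send(message, DISCOVERY_PORT, GROUP)
    }, 1000)
    socket.on('close', function () { clearInterval(timer) })
  })

  return socket
}
```

Calling startDiscovery({ host: '127.0.0.1', port: 9000 }, function (peer) { console.log('found', peer) }) starts listening and announcing; closing the returned socket stops both.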

However, not all VPCs support multicast, and not all replication scenarios take place within a single subnet, so you may want to add known servers to the configuration, for instance...

{ peers: ['100.2.14.104:8000', '100.2.14.105:8000'] }

Otherwise, you can use a service registry like seaport or a module like aws-instances to feed the peers member of the options object.
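As a sketch of feeding the peers member from such a source, the snippet below merges a static list with instance records and removes duplicates. The record shape ({ host, port }) and the toPeers helper are hypothetical; they do not reflect the seaport or aws-instances APIs, so adapt the mapping to whatever your registry returns.

```javascript
// Build a de-duplicated list of 'host:port' strings from a static list of
// known peers plus hypothetical registry records shaped like { host, port }.
function toPeers(instances, knownPeers) {
  var fromRegistry = instances.map(function (instance) {
    return instance.host + ':' + instance.port
  })
  var seen = {}
  return knownPeers.concat(fromRegistry).filter(function (address) {
    if (seen[address]) return false
    seen[address] = true
    return true
  })
}

// Made-up records standing in for a registry lookup result.
var instances = [
  { host: '100.2.14.105', port: 8000 },
  { host: '100.2.14.106', port: 8000 }
]

var replicationConfig = {
  port: 9000,
  host: '127.0.0.1',
  peers: toPeers(instances, ['100.2.14.104:8000', '100.2.14.105:8000'])
}

console.log(replicationConfig.peers)
// prints [ '100.2.14.104:8000', '100.2.14.105:8000', '100.2.14.106:8000' ]
```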

FAQ

Q. Why not expose the TCP connection so I can pool it / manage it myself?

A. Connections are not long-lived: a server connects, has a conversation, and then disconnects.

Q. Why not allow me to manage the connection protocol?

A. Give me a good use case.