mafintosh / hyperdb

Distributed scalable database
MIT License

feature: db.createDiffStream() #12

Closed: hackergrrl closed this 6 years ago

hackergrrl commented 6 years ago

An API that lets you compare the current state of a hyperdb to an earlier state. For example:

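var hyperdb = require('hyperdb')
var ram = require('random-access-memory')

var db = hyperdb(ram, {valueEncoding: 'json'})

db.put('/a/foo', 'quux', function (err) {
  if (err) throw err
  // capture the state of the db right after the first write
  db.snapshot(function (err, at) {
    if (err) throw err
    db.put('/a/foo', 'baz', function (err) {
      if (err) throw err
      // diff everything under /a between the snapshot and now
      var rs = db.createDiffStream('/a', at)
      rs.on('data', console.log)
    })
  })
})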

This outputs:

{ type: 'del', name: '/a/foo', value: 'baz' },
{ type: 'put', name: '/a/foo', value: 'quux' }

You can get a diff on any key prefix, much like db.watch().

This also adds another API function, db.snapshot(), which captures the feedID/seq tuples at the moment it's called. I'm not sure this is the best way to expose it (maybe db.put() could return a snapshot each time?). It's probably the least well thought-out part of this change.
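
To make the "feedID/seq tuples" idea concrete, here is a minimal sketch that does not use hyperdb at all. The `feeds` array, its `id` and `length` properties, and the helpers `takeSnapshot` and `inSnapshot` are all hypothetical stand-ins, not the actual internals of db.snapshot():

```javascript
// Hypothetical sketch: `feeds` stands in for the db's underlying
// append-only logs; `id` and `length` are assumed property names.
function takeSnapshot (feeds) {
  return feeds.map(function (feed) {
    // every entry with an index below `seq` existed at snapshot time
    return { feed: feed.id, seq: feed.length }
  })
}

// true if entry number `seq` of feed `feedId` is covered by the snapshot
function inSnapshot (snapshot, feedId, seq) {
  return snapshot.some(function (at) {
    return at.feed === feedId && seq < at.seq
  })
}

var feeds = [{ id: 'a', length: 3 }, { id: 'b', length: 1 }]
var at = takeSnapshot(feeds)
feeds[0].length = 5 // later writes do not change the captured tuples

console.log(at)
console.log(inSnapshot(at, 'a', 2), inSnapshot(at, 'a', 4))
```

Because the snapshot is just a small array of tuples, it would be cheap to hand back from every write, which is what the db.put() idea above would amount to.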

My motivation for this was thinking about building views or indexes on top of a hyperdb, not unlike hyperlog-index or flumeview-level.
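
As a rough illustration of the view idea, the sketch below folds diff entries of the shape `{ type, name, value }` into an in-memory Map. `applyDiff` is a hypothetical helper, and the Map is only a stand-in for whatever store a real index would use:

```javascript
// Hypothetical helper: applies diff entries, in order, to a Map-based view.
function applyDiff (view, entries) {
  entries.forEach(function (entry) {
    if (entry.type === 'del') view.delete(entry.name)
    else if (entry.type === 'put') view.set(entry.name, entry.value)
  })
  return view
}

var view = new Map([['/a/x', 1], ['/a/y', 2]])

applyDiff(view, [
  { type: 'del', name: '/a/x', value: 1 },
  { type: 'put', name: '/a/z', value: 3 }
])

console.log(Array.from(view.entries()))
```

A view built this way only has to remember the last snapshot it processed, and can catch up by asking for the diff since then.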

This is a basic implementation, and has caveats:

  1. Results are accumulated and only written to the stream once the full diff has been computed. I bet this can be done smarter, with entries emitted as they're found.
  2. All diffs are between a snapshot and HEAD (now). It would be nice if you could pass in two snapshots via opts or something.
  3. Live streaming would be great! Right now it's from two static, well-defined points in time.
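
The first two caveats can be sketched without hyperdb, assuming each state is available as a plain Map of key to value. `diffStates`, `from` and `to` are hypothetical names, and which snapshot plays which role is left to the caller. A generator yields each entry as soon as it is found rather than buffering the whole diff:

```javascript
// Hypothetical sketch: `from` and `to` are Maps of key -> value standing in
// for the db state at two snapshots. Values are compared with ===, so this
// only handles primitive values.
function * diffStates (prefix, from, to) {
  // keys whose value in `from` is missing or different in `to`
  for (var [name, value] of from) {
    if (!name.startsWith(prefix)) continue
    if (!to.has(name) || to.get(name) !== value) {
      yield { type: 'del', name: name, value: value }
    }
  }
  // keys whose value in `to` is missing or different in `from`
  for (var [name2, value2] of to) {
    if (!name2.startsWith(prefix)) continue
    if (!from.has(name2) || from.get(name2) !== value2) {
      yield { type: 'put', name: name2, value: value2 }
    }
  }
}

var older = new Map([['/a/foo', 'quux'], ['/a/bar', 1], ['/b/x', 1]])
var newer = new Map([['/a/foo', 'baz'], ['/a/bar', 1], ['/b/x', 2]])

var entries = Array.from(diffStates('/a/', older, newer))
entries.forEach(function (entry) { console.log(entry) })
```

This sketch still needs both states fully in memory. A real implementation would presumably walk the database's own index instead, and live streaming (caveat 3) is not covered here.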

Cheers! 🎉

mafintosh commented 6 years ago

1.4.0!!!!!