pemrouz / versioned


Versioned Data Discussion/Proposal #1

Open pemrouz opened 8 years ago

pemrouz commented 8 years ago

Versioned Data Discussion/Proposal

TL;DR: This is a first cut at creating persistent, versioned data in the manner of Kafka. The eventual aim is to support better, more robust replication of data across a distributed environment (e.g. multi-server Ripple), and to improve UI performance and logic by embracing changelogs. The concept is explained in "Changelog 101: Tables and Events are Dual" [1]; [2] is a good related video. The code is abstracted from some earlier work and turned into a generic module. I'm sharing this early preview because I'd like to hear whether people think this is a good or bad idea, and any thoughts they have around it.

In general, when we talk about "immutable" data structures, we aren't really interested in the data being frozen; we care more about its persistence: that we can make changes and keep an immutable reference to both versions. Logically, however, these are both versions of the same thing, so creating two root references is quite a low-level API that does not reflect this fact. Reasoning in many applications can be simplified if all data/state stores a reference to its past versions, rather than that history having to be managed manually or externally. Beyond the UI, and beyond a single VM in particular, the structural sharing of hash array mapped tries doesn't help with the efficient and reliable replication of data across environments, which is where the fundamental log/events structure is useful.

This module builds on Immutable.js (for the structurally efficient changelog) and Object.observe (so we can use native objects), to add a log property to any object or array:

var o = versioned([])
// o.log.length == 1

Any changes will result in the changelog being updated as a secondary phenomenon:

o.push('lorem')
// o.log.length == 2
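To make the behaviour above concrete, here is a minimal sketch of a log that grows as a side effect of ordinary mutations. It is not the module's implementation: it assumes a Proxy in place of Object.observe, plain shallow copies in place of Immutable.js, and the name `versionedSketch` is hypothetical.

```javascript
// Hypothetical sketch, NOT the module's code: uses a Proxy instead of
// Object.observe and shallow copies instead of Immutable.js.
function versionedSketch(target) {
  // Shallow copy standing in for an immutable reference to this version.
  const snapshot = () =>
    Array.isArray(target) ? target.slice() : Object.assign({}, target);

  // The log starts with one record: the initial version, with no diff.
  const log = [{ value: snapshot(), diff: null }];
  Object.defineProperty(target, 'log', { value: log }); // non-enumerable

  const record = (diff) => log.push({ value: snapshot(), diff });

  return new Proxy(target, {
    set(obj, key, value) {
      // Array length bookkeeping (e.g. from push/pop) is not logged.
      if (Array.isArray(obj) && key === 'length') {
        return Reflect.set(obj, key, value);
      }
      const type = Object.prototype.hasOwnProperty.call(obj, key)
        ? 'update'
        : 'add';
      Reflect.set(obj, key, value);
      record({ key, value, type });
      return true;
    },
    deleteProperty(obj, key) {
      if (!Object.prototype.hasOwnProperty.call(obj, key)) return true;
      const value = obj[key];
      Reflect.deleteProperty(obj, key);
      record({ key, value, type: 'delete' });
      return true;
    }
  });
}

const sketch = versionedSketch([]);
sketch.push('lorem'); // sketch.log.length is now 2
```

Each log record keeps both the snapshot and the diff, so earlier versions stay reachable from the object itself rather than through a second root reference.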

All Array/Object.observe changes are normalized to the tuple { key, value, type (add | update | delete) }. For each record in the log, we store the immutable reference and this diff. Operations like splice may result in multiple entries in the log. Since "table" and "events" are duals, we can also record a fact directly in the log which will update the table (the final value):

set(o)({ key: 1, value: 'ipsum', type: 'add' })
// o == ['lorem', 'ipsum']
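The duality can be illustrated by replaying a list of change tuples to rebuild the table. This is only a sketch: `applyChangeSketch` and `replaySketch` are hypothetical names, and the handling of `delete` on arrays (removing the element with `splice`) is an assumption rather than the module's documented behaviour.

```javascript
// Hypothetical sketch: apply one { key, value, type } tuple to a table.
// Curried in the same shape as set(o)(change) above.
function applyChangeSketch(table) {
  return function ({ key, value, type }) {
    if (type === 'delete') {
      // Assumption: deleting from an array removes the element entirely.
      if (Array.isArray(table)) table.splice(Number(key), 1);
      else delete table[key];
    } else {
      // 'add' and 'update' both write the value at the given key.
      table[key] = value;
    }
    return table;
  };
}

// Replaying the events from an empty table reproduces the final value.
function replaySketch(changes, initial = []) {
  return changes.reduce((table, change) => applyChangeSketch(table)(change), initial);
}

const rebuilt = replaySketch([
  { key: '0', value: 'lorem', type: 'add' },
  { key: '1', value: 'ipsum', type: 'add' }
]);
// rebuilt is ['lorem', 'ipsum']
```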

We could create a more ergonomic API for common operations, like push (where type and key would be inferred). This is largely so it can be easily used in a non-O.o world, since its future now looks uncertain. Alternatives are to implement this using Proxies or just a custom API.

Objects are also emitterified, so changes on either the log or the object itself will result in an event being emitted with the corresponding change tuple:

o.on('change', replicate)
o.push('dolor')
// replicate will receive { key: '2', value: 'dolor', type: 'add' }

This signature is what Ripple currently uses to stream and replicate changes to be replayed on databases or other servers.
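As a rough illustration of replication driven by change tuples, the sketch below wires a source array to a replica. It makes several assumptions: `emitterifySketch` and `pushAndEmit` are hypothetical helpers, the change is emitted by hand rather than detected automatically, and the replica's `delete` handling is a guess.

```javascript
// Hypothetical helper: adds minimal on/emit methods to any object.
function emitterifySketch(obj) {
  const listeners = {};
  Object.defineProperty(obj, 'on', {
    value(type, fn) {
      (listeners[type] = listeners[type] || []).push(fn);
      return obj;
    }
  });
  Object.defineProperty(obj, 'emit', {
    value(type, payload) {
      (listeners[type] || []).forEach((fn) => fn(payload));
      return obj;
    }
  });
  return obj;
}

const source = emitterifySketch([]);
const replica = [];

// The listener replays each change tuple on the replica.
source.on('change', ({ key, value, type }) => {
  if (type === 'delete') replica.splice(Number(key), 1); // assumption
  else replica[key] = value;
});

// In this sketch the tuple is built and emitted by hand.
function pushAndEmit(arr, value) {
  const key = String(arr.length);
  arr.push(value);
  arr.emit('change', { key, value, type: 'add' });
}

pushAndEmit(source, 'lorem');
pushAndEmit(source, 'ipsum');
// replica now mirrors source: ['lorem', 'ipsum']
```

Because the listener only sees `{ key, value, type }`, the same function could write to a database or forward the tuple to another server instead of a local array.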

One of the main UI benefits will be to abstract shouldComponentUpdate logic out of individual components and into a function annotation instead (using either !== or a check of whether data.log.length is greater than last time), which mostly looks something like:

function component(data){
  if (!shouldComponentUpdate(this, data)) return;
  // ... rest of the component
}
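One possible shape for that check, written as a wrapper so individual components need no update logic of their own, is sketched here. The names `shouldComponentUpdateSketch` and `pureSketch` are hypothetical, and remembering the last-seen state in a WeakMap keyed on `this` is an assumption about how "last time" might be tracked.

```javascript
// Remembers, per node, the data reference and log length last rendered.
const lastSeen = new WeakMap();

// Hypothetical check: update if the data reference changed (!==)
// or if the log has grown since the previous render.
function shouldComponentUpdateSketch(node, data) {
  const seen = lastSeen.get(node);
  const length = data.log.length;
  const changed = !seen || seen.data !== data || length > seen.length;
  if (changed) lastSeen.set(node, { data, length });
  return changed;
}

// Hypothetical annotation: wraps a component so it skips redundant renders.
function pureSketch(component) {
  return function (data) {
    if (!shouldComponentUpdateSketch(this, data)) return;
    return component.call(this, data);
  };
}
```

A component wrapped with `pureSketch` and invoked with the same `this` and unchanged data returns early; appending to `data.log` makes the next call render again.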

If you're interested in more code, you can check test.js. There are a lot of things missing, but I'd be interested to know what you think of the general concept. //cc @leebyron @jkimbo @sammyt @mstade @mamapitufo @aaronhaines @gabrielmontagne @mere @tomsugden @lukestephenson

[1] The Log: What every software engineer should know about real-time data's unifying abstraction

[2] Turning the Database Inside Out

mstade commented 8 years ago

This is largely so it can be easily used in a non-O.o world since its future now looks uncertain.

Nah man, it's pretty certainly dead. :o)

jkimbo commented 8 years ago

Interesting proposal! Couple of thoughts:

Interested to hear your views on this.

pemrouz commented 8 years ago

Ok, managed to make a lot more refinements after folding this into a new app:

Couple of related future ideas that would be nice to explore/incorporate: