basho / riak_dt

Convergent replicated datatypes in Erlang
Apache License 2.0
353 stars 70 forks source link

Reset-remove semantic on Map #93

Closed russelldb closed 10 years ago

russelldb commented 10 years ago

With a shared causal context for the map and the CRDTs it embeds, removing a field, and then re-adding it has the effect of removing all the additions to that field when merging with a concurrent update to the field. For example:

Add 'X' to set in field F at A
Merge with B
Add 'Y' to set in field F at B
remove field F at A
Add 'Z' to field F at A
merge A and B

The set contains only 'Y' and 'Z', since the remove of the field at A containing 'X' constitutes a remove of 'X'.

However, if the field was not re-added by A the semantic was different, and the result would contain 'X, Y, Z'. This seemed counter-intuitive and confusing. These changes add a temporary tombstone to present fields, so that the removal of a field effects a removal of all dots seen by the field at removal time. But you don't have to re-add the field to see this effect. Neat.

One intuitive way to think of it as like an offline sync of a file system like Dropbox. With this semantic, removing a directory on one machine, and adding a file to that directory on another, merges to the directory being present, with only the added file in it.

There is some oddness with counters, though.

I added a new counter type, riak_dt_emcntr (see it's docs for more details) that stores a dot as latest event against each actors increments and decrements. This is so a field removal can reset the counter value. However, we don't store a dot per increment (that would be crazy inefficient). Counters work will in this scenario

A increments counter at field F by 10
B merges A
A removes field F
B increments counter at field F by 5
A and B merge

Counter is at 5 (A's dot for the increment is dropped)

However there is an anomaly

A increments counter at field F by 10
B merges A
A increments counter by 5
B removes field F
A and B merge

The counter is at 15.

Now, technically, A's first increment should be dropped (should it??) and only the second stand, but that does not happen, as we don't keep a dot per increment. This, too, can be spun in the filesystem sense. If you updated a file and concurrently removed it, you wouldn't expect only the diff to survive, but the whole file.

Personally I'm happy enough with the semantic, to enable reset-remove on counters requires a dot per increment (a counter as set of increments) and that is not worth the overhead.

Happy to rebase once the full round of reviewing is done.

OOOOOOOH, And I removed add from the Map API. Problem?

russelldb commented 10 years ago

This is kind of WIP, in that there is still W to do. But I thought it prudent to get eyes on early.

Still todo:-

seancribbs commented 10 years ago

Can you rebase or merge develop into this so it will merge cleanly (and make borshop happy)?

russelldb commented 10 years ago

After the review I can.

seancribbs commented 10 years ago

Is it ok that the map_eqc tends to have 0 as the most common actor count? Ran it for 6 minutes and got this output:

19.94% 0
11.44% 4
8.55% 3
8.31% 2
8.15% 6
[...]
russelldb commented 10 years ago

@seancribbs good question, I need to look at that some more. I mean, that's a high percentage, I think I need further metrics to see what size the map is, 'cos I suspect that there is still an issue with the state for this test. Luckily I have some expert eyes tomorrow.

seancribbs commented 10 years ago

@russelldb I've rebased your branch, cleaning up a duplicate commit and grouping similar work. To get your local copy working:

  1. git stash any unsaved changes you have
  2. git fetch origin to get the latest head of the branch
  3. git reset --hard origin/feature-bug/rdb/removes-remove to move your local branch to the rebased one
  4. git stash apply to reapply your local changes

Ping me if you have trouble with this.

russelldb commented 10 years ago

OK. Addressed review comments and pushed. Just going to remove add from kv api, then ready for final review.

russelldb commented 10 years ago

Removed "add" from PB and HTTP APIs https://github.com/basho/riak_pb/pull/93, https://github.com/basho/riak_kv/pull/951

seancribbs commented 10 years ago

:+1: ffce004

seancribbs commented 10 years ago

@borshop merge