basho / riak_dt

Convergent replicated datatypes in Erlang
Apache License 2.0
353 stars 70 forks source link

LWW Registers in Maps may go "backwards" in time #100

Closed lenary closed 7 years ago

lenary commented 10 years ago

The merging behaviour around LWW and the Map's reset-remove semantic can produce some behaviour where the value of a LWW register may go back to a value with a previous timestamp.

So, to setup the scenario, imagine 2 writing nodes, A and B.

  1. Two things concurrently happen:
    a. B writes <b> into a register inside a map, with ts 0.
    b. concurrently, A writes <a> into the same register inside a map, with ts 1.
  2. A replicates to B. i.e. B := merge(B, A)
  3. A removes the register inside the map, at ts >1.
  4. A and B merge, so A := merge(A,B) and B := merge(B,A) (merge is commutative, so order of arguments doesn't actually produce any difference in result).

In Riak 2.0, the final value of the register inside the map on both nodes is <b>. However, if we examine the history of the register at B, we can see something peculiar. After step 1, the value is evidently <b> with ts 0. After step 2, the value must be <a> with ts 1, as it has a higher timestamp. However, after step 4, the value is back to <b> with ts 0. The timestamp has decreased, which should never happen.

The root cause of this is the mixture of the causal ordering (used by the map) and the timestamp ordering (used by the register inside the map). We are currently investigating fixes, but it looks likely that we will replace the lww-register (which only uses timestamps for value resolution) with a causal-lww register, which uses causal order as much as possible for value resolution, but falls back on timestamps when there are concurrent edits. This is very much like how riak_object works with allow_mult=false.

Unfortunately, due to the time required to develop and test a causal-lww register, it seems unlikely we will get this fixed before 2.0 comes out.

russelldb commented 10 years ago

Hey, I don't think it needs "fixing" for 2.0. It's a known issue. Lets add it to the release notes and think about it for future maps. I think the Map we are releasing behaves quite well wrt to LWW reg, since it keeps dot->reg mappings for concurrent writes and only merges on update. The new 'Paulo' Map is more problematic since it merges-on-merge and values "escape" earlier.

There is no well defined semantic for this, in the literature or in general use, so what we go with becomes the state-of-the-art until we/someone gets smarter.