basho / riak_dt

Convergent replicated datatypes in Erlang
Apache License 2.0

Counters in Maps cannot be `reset-remove` #98

Open russelldb opened 10 years ago

russelldb commented 10 years ago

The DT Map provides a reset-remove semantic for field removal.

When a field is removed from the Map, it is the same as

For example, remove Set at X. If the Set contained [bob, joe, sally, ann], then the observed dots for those entries would be removed when the Set is removed. A concurrent update to the field, add sue to Set at Z, would result in the field X being present (Add-Wins semantic) but the value being only [sue], as the remove of the field counts as a remove of all its values.
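The Set example can be sketched in a few lines of Erlang. This is a simplified model, not riak_dt's actual code: the module name `reset_remove_set`, the `{Entries, Seen}` representation, and the concrete dots are all assumptions made for illustration. An entry survives a merge only if both sides still hold it, or if the other side never observed its dot.

```erlang
-module(reset_remove_set).
-export([merge/2, value/1, example/0]).

%% Hypothetical model of one replica's view of a Set field: {Entries, Seen}.
%% Entries :: [{Element, Dot}]  live elements and the dot that added each one.
%% Seen    :: [Dot]             every dot this replica has observed.

%% Keep an entry if both sides hold it, or if the other side never
%% observed its dot (and so cannot have removed it).
merge({EntriesA, SeenA}, {EntriesB, SeenB}) ->
    KeepA = [E || {_, Dot} = E <- EntriesA,
                  lists:member(E, EntriesB) orelse not lists:member(Dot, SeenB)],
    KeepB = [E || {_, Dot} = E <- EntriesB,
                  lists:member(E, EntriesA) orelse not lists:member(Dot, SeenA)],
    {lists:usort(KeepA ++ KeepB), lists:usort(SeenA ++ SeenB)}.

value({Entries, _Seen}) ->
    lists:usort([Elem || {Elem, _Dot} <- Entries]).

example() ->
    Dots = [{a, 1}, {a, 2}, {a, 3}, {a, 4}],
    %% One replica removes the field: no live entries, but the dots stay observed.
    Removed = {[], Dots},
    %% Another replica concurrently adds sue on top of the original contents.
    Concurrent = {[{bob, {a, 1}}, {joe, {a, 2}}, {sally, {a, 3}},
                   {ann, {a, 4}}, {sue, {z, 1}}],
                  Dots ++ [{z, 1}]},
    value(merge(Removed, Concurrent)).
```

Calling `reset_remove_set:example()` returns `[sue]`: the four original entries are dropped because the removing replica had observed their dots, while the dot for sue was unseen and so survives.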

However, with a counter this is not currently possible. For space efficiency reasons we do not store a dot-per-increment of a counter. Doing so would lead to a truly enormous set of increments for any reasonably active counter. Instead a counter is modelled as

[{actor, dot, increment, decrement}]

When an actor increments the counter it reads the entry for that actor, adds to either increment or decrement, and sets the dot to the current event. When two counter fields merge, the entry with the largest dot is kept. This way a counter field that is removed and then re-added will not be overwritten by its previous value on merge, which would otherwise happen if the previous value was larger. See the docs/code of PN-Counter for details of counter merge behaviour (basically a per-actor max).
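A minimal sketch of that per-actor model, assuming a dot can be treated as a plain integer event number. The module name `dotted_counter` and its functions are hypothetical and are not riak_dt's API. The merge here only shows the "largest dot wins" rule and ignores the Map's removal context.

```erlang
-module(dotted_counter).
-export([update/4, merge/2, value/1]).

%% Counter :: [{Actor, Dot, Inc, Dec}], one entry per actor.
%% Dot is modelled as an integer event number (an assumption).

%% Read the actor's entry, add to increment or decrement, stamp the new dot.
update(Actor, Dot, By, Counter) ->
    {Inc, Dec} = case lists:keyfind(Actor, 1, Counter) of
                     {Actor, _OldDot, I, D} -> {I, D};
                     false -> {0, 0}
                 end,
    Entry = case By >= 0 of
                true -> {Actor, Dot, Inc + By, Dec};
                false -> {Actor, Dot, Inc, Dec - By}
            end,
    lists:keystore(Actor, 1, Counter, Entry).

%% For each actor, keep whichever entry carries the larger dot.
merge(A, B) ->
    Actors = lists:usort([element(1, E) || E <- A ++ B]),
    [newest(lists:keyfind(Actor, 1, A), lists:keyfind(Actor, 1, B))
     || Actor <- Actors].

newest(false, E) -> E;
newest(E, false) -> E;
newest({_, DotA, _, _} = EA, {_, DotB, _, _}) when DotA >= DotB -> EA;
newest(_, EB) -> EB.

value(Counter) ->
    lists:sum([Inc - Dec || {_Actor, _Dot, Inc, Dec} <- Counter]).
```

With this rule, merging an old entry `[{a, 1, 5, 0}]` with a re-added entry `[{a, 2, 1, 0}]` keeps the re-added one, so the value is 1 rather than the larger previous value 5. A plain per-actor max on the counts would have kept 5.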

This design is necessary to support removal of counter fields. However there is the following issue:

The original increment of A is not reset when B removes the field since it is "absorbed" in the second increment by A.
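One reading of this scenario, sketched under assumptions: A increments by 1, B observes that state and removes the field, and A concurrently increments by 1 again. The module name `absorbed_increment`, the `{Entries, Seen}` field representation, and the concrete dots are illustrative only, and the merge is a simplified dot-survival rule rather than riak_dt's implementation.

```erlang
-module(absorbed_increment).
-export([example/0, merge/2]).

%% Hypothetical field state: {Entries, Seen}.
%% Entries :: [{Actor, Dot, Inc, Dec}], Seen :: [Dot].

example() ->
    %% A incremented by 1 at dot {a, 1}; B saw that state and removed the field.
    AtB = {[], [{a, 1}]},
    %% A concurrently increments by 1 again. Its single entry now carries
    %% dot {a, 2} and a running total of 2, so the first increment is
    %% folded into the same entry.
    AtA = {[{a, {a, 2}, 2, 0}], [{a, 1}, {a, 2}]},
    {Entries, _Seen} = merge(AtA, AtB),
    lists:sum([Inc - Dec || {_, _, Inc, Dec} <- Entries]).

%% Simplified merge: an entry survives if both sides hold it, or if the
%% other side never observed its dot.
merge({EntriesA, SeenA}, {EntriesB, SeenB}) ->
    Keep = fun(Mine, Theirs, TheirSeen) ->
                   [E || {_, Dot, _, _} = E <- Mine,
                         lists:member(E, Theirs)
                             orelse not lists:member(Dot, TheirSeen)]
           end,
    {lists:usort(Keep(EntriesA, EntriesB, SeenB) ++ Keep(EntriesB, EntriesA, SeenA)),
     lists:usort(SeenA ++ SeenB)}.
```

Here `absorbed_increment:example()` returns 2. B never observed dot `{a, 2}`, so A's whole entry survives the merge, including the first increment that B had seen and removed. A strict reset-remove would leave only the concurrent increment, giving 1.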

We can explain this away by imagining a disconnected file system (like, say, Dropbox). On one machine a user has removed a file (the counter), and on another they have edited the file (the counter). When the systems sync up, you'd expect the file to be present and the edited file to be intact, not just the diff of the edit.

Possible fixes: