helium / plumtree

Epidemic Broadcast Trees
Apache License 2.0

Concurrency problems with state update can lead to membership data loss #18

Open cmeiklejohn opened 9 years ago

cmeiklejohn commented 9 years ago

The attempt_join process uses the externally exported update_state function to write its state back to the ETS table. This is racy: if the state in the ETS table has changed in the meantime, those changes are overwritten when attempt_join persists its own copy back to the table.
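To make the lost update concrete, here is a minimal Python sketch of the interleaving, not the actual Plumtree code. The dict stands in for the ETS table, the membership state is modelled as a plain set, and `read_state`, the `update_state` signature, and the node names are all assumptions made for illustration.

```python
# Hypothetical stand-in for the ETS table: one key holding the membership set.
table = {"state": frozenset({"node_a"})}

def read_state():
    return table["state"]

def update_state(new_state):
    # Blind overwrite of the stored state, with no merge against what is there.
    table["state"] = new_state

# Writer 1 (think attempt_join) takes a snapshot of the current state.
snapshot = read_state()

# Writer 2 updates the table before writer 1 writes back.
update_state(read_state() | {"node_b"})

# Writer 1 writes back a state derived from its now-stale snapshot.
update_state(snapshot | {"node_c"})

# node_b was added by writer 2 but is no longer in the table.
lost = "node_b" not in table["state"]
```

After the last write the table holds only `node_a` and `node_c`; the membership change made by the second writer is gone.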

What really needs to happen is that attempt_join submits only its own changes; the metadata manager then needs to act serially and compute the merge, or join, itself before binding the state change. This is the only safe way to compute this join, and it is how Lasp deals with concurrent edits to the same CRDT at a single replica.
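A rough sketch of that shape in Python, under stated assumptions: `StateManager` and its `merge` method are hypothetical names, a lock stands in for a single serial process, and set union stands in for the CRDT join. It is not the patch itself, only an illustration of callers submitting their own changes while one owner computes the join.

```python
import threading

class StateManager:
    """Hypothetical single owner of the state; names are illustrative only."""

    def __init__(self, initial):
        self._state = frozenset(initial)
        self._lock = threading.Lock()

    def merge(self, delta):
        # The join is computed here, against the current state and under the
        # lock, so a caller can never overwrite changes it did not see.
        with self._lock:
            self._state = self._state | frozenset(delta)
            return self._state

    def get(self):
        with self._lock:
            return self._state

manager = StateManager({"node_a"})

# Two concurrent writers each submit only their own change.
threads = [
    threading.Thread(target=manager.merge, args=({name},))
    for name in ("node_b", "node_c")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_state = manager.get()
```

Because union is commutative and the merge runs serially, both additions survive regardless of which writer gets in first.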

cmeiklejohn commented 9 years ago

I've got a patched version of this in the Lasp fork of Plumtree.

cmeiklejohn commented 9 years ago

Addressed by https://github.com/helium/plumtree/pull/23.