basho / riak_kv

Riak Key/Value Store
Apache License 2.0

Make ring resizing work with riak_ensemble [JIRA: RIAK-1652] #900

Open jtuple opened 10 years ago

jtuple commented 10 years ago

Currently, riak_ensemble does not support deleting ensembles -- only creating them. We should add support for deleting ensembles to riak_ensemble, and update riak_kv_ensembles and riak_kv_ensemble_backend to delete ensembles if the ring size shrinks. Without this change, a cluster using strong consistency will break if dynamic ring resizing is used to shrink the ring. Growing the ring, on the other hand, should be safe, although we should test this.
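To make the shrink/grow asymmetry concrete, here is a rough sketch (Python, not Riak code) of which partition indexes disappear or appear when the ring size changes. It assumes power-of-two ring sizes, evenly spaced partitions over a 2^160 keyspace, and ensembles tied to partition indexes; the function names are made up for illustration and are not part of riak_ensemble or riak_kv.

```python
# Illustrative only: the function names here are hypothetical, not Riak APIs.
RING_TOP = 2 ** 160  # assumed size of the consistent-hash keyspace


def partition_indexes(ring_size):
    """Return the set of partition indexes for a ring of the given size."""
    if ring_size <= 0 or ring_size & (ring_size - 1):
        raise ValueError("ring size must be a positive power of two")
    step = RING_TOP // ring_size
    return {i * step for i in range(ring_size)}


def ensemble_changes(old_size, new_size):
    """Return (to_delete, to_create): indexes that vanish and that appear."""
    old = partition_indexes(old_size)
    new = partition_indexes(new_size)
    return sorted(old - new), sorted(new - old)


# Shrinking 64 -> 32 leaves 32 indexes with no vnode behind them.
stale, fresh = ensemble_changes(64, 32)
print(len(stale), len(fresh))

# Growing 32 -> 64 only adds indexes; nothing needs deleting.
stale, fresh = ensemble_changes(32, 64)
print(len(stale), len(fresh))
```

Under these assumptions, shrinking always leaves indexes whose ensembles would need deleting, while growing only requires creating new ones, which matches the expectation that growing should be the safe direction.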

We either need to fix this before shipping 2.0, or decide to not support ring resizing for strongly consistent clusters until a later release.

/cc basho/riak#536

jonmeredith commented 10 years ago

Any idea on how much work? May have to punt on this post-2.0.

jrwest commented 10 years ago

I vote punt. Yokozuna doesn't support it either.

jtuple commented 10 years ago

Punted to 2.1

randysecrist commented 10 years ago

What are the implications of the break? How will this impact customers who grow and shrink clusters on a regular basis?

reiddraper commented 10 years ago

@randysecrist

> How will this impact customers who grow and shrink clusters on a regular basis?

This doesn't affect shrinking and growing a cluster by adding/removing nodes. It only affects changing the ring size, which is still an experimental feature, afaik.

nickelization commented 9 years ago

Just commenting to note that this is still an issue in the latest code. I discovered this problem completely by accident when I noticed extremely high CPU usage coming from riak_ensemble on my dev setup. Long story short, it turned out that I had previously been running at a larger ring size, and hadn't cleared out the ensemble data when I wiped my cluster data. So it continued trying to start ensembles for non-existent vnodes, which ended up getting stuck in loops of continuously trying to monitor a vnode process and then getting the {'DOWN',...} message and starting over again.
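For anyone hitting the same thing, the check I effectively did by hand looks roughly like the sketch below (Python, purely illustrative). It assumes each saved ensemble can be represented as a `("kv", partition_index, n_val)` tuple and that you already have the set of partition indexes that currently have vnodes; both the tuple shape and the `split_orphans` helper are assumptions for the example, not riak_ensemble's actual data format or API.

```python
# Illustrative only: the id shape and helper name are hypothetical.
def split_orphans(saved_ensembles, live_indexes):
    """Split saved ensemble ids into (live, orphaned) lists.

    An ensemble is orphaned when its partition index has no vnode
    in the current ring.
    """
    live, orphaned = [], []
    for ens in saved_ensembles:
        _kind, index, _n_val = ens
        if index in live_indexes:
            live.append(ens)
        else:
            orphaned.append(ens)
    return live, orphaned


saved = [("kv", 0, 3), ("kv", 10, 3), ("kv", 20, 3)]
live, orphaned = split_orphans(saved, live_indexes={0, 20})
print(orphaned)  # the entries that would keep retrying against missing vnodes
```

Anything that lands in the orphaned list corresponds to an ensemble that would keep trying to monitor a vnode that no longer exists.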