This is a meta-issue to track the remaining strong consistency work for Riak 2.0.
[x] Enabling strong consistency currently requires a manual step from the Erlang console, and that step must be performed on exactly one node in the cluster. This should be automated or exposed via a new riak-admin command. (basho/riak_core#571)
[x] Nodes cannot safely be removed from a cluster that uses strong consistency. (basho/riak_core#572)
[x] Using strong consistency on slow disks without reducing ring size is likely to lead to 100% CPU/disk usage and poor performance. (basho/riak_ensemble#15)
[x] There are no user-facing riak-admin commands to inspect the state of the consensus system. (basho/riak_ensemble#9)
[x] There are no stats for consistent K/V operations. (basho/riak_kv#876)
[x] Pending ensemble peers are trusted by default, which makes Riak vulnerable to node failures while ownership changes are occurring (basho/riak_ensemble#17)
[x] The current AAE-based ensemble syncing approach used by riak_kv is more sensitive to node failures / network partitions than it should be (basho/riak_kv#908)
[x] Riak K/V ensemble data is hardcoded to never be trusted. We should make this a configurable setting. Then, users who trust their disks not to silently lose data can choose to switch to trust-by-default in order to need fewer online replicas (basho/riak_kv#909)
[x] Several consensus subsystem related settings are hardcoded, but should instead be configurable for advanced users / support scenarios (basho/riak_ensemble#18)
[x] Writes to consistent bucket types should fail fast if strong consistency is actually disabled. Currently, riak_client will attempt to use riak_ensemble_client which will error when consensus is disabled. (basho/riak_kv#713)
[x] K/V ensemble peers can start up before riak_kv is ready, leading to various issues (basho/riak_kv#984)
[x] K/V ensemble peers send messages to the vnode proxy without accounting for the fact that the proxy and/or vnode may crash and drop the message on the floor (basho/riak_kv#985)
[x] K/V ensemble peers send messages to the vnode proxy without accounting for the fact that the proxy may drop messages during an overload situation (basho/riak_kv#986)
[x] Ensemble leaders do not step down when they fail a local put, but they should (basho/riak_ensemble#27)
[x] Ensemble leaders do not step down when they fail a local get, but they should (basho/riak_ensemble#30) (Not an issue, closed)
[x] Riak does not gracefully handle the case where consensus is not enabled uniformly across the entire cluster. This should be handled better, since mixed configuration is a necessary evil during a rolling configuration change (basho/riak#559)
[x] The ensemble manager does not guarantee state is saved when enabling consensus, which can lead to a potential race condition (basho/riak_ensemble#34)
[x] New integrated integrity checking (basho/riak_ensemble#37)
[x] A small change to check the leader lease after reads is necessary to guarantee safety. Also, we should make the leader-only read configurable, in case we need other safety options -- e.g., a user can configure Riak to always do quorum reads instead. Consider it a "get out of jail" card. (basho/riak_ensemble#41)
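
For illustration only, the lease check described in the last item above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the riak_ensemble implementation: the `Leader` class, the `leader_only_reads` switch, and the `quorum_read` placeholder are all hypothetical names invented for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Leader:
    """Hypothetical ensemble leader holding a time-bounded lease."""
    store: Dict[str, str]
    lease_expiry: float                       # deadline on the leader's clock
    leader_only_reads: bool = True            # hypothetical config switch
    clock: Callable[[], float] = time.monotonic

    def lease_valid(self) -> bool:
        return self.clock() < self.lease_expiry

    def quorum_read(self, key: str) -> Optional[str]:
        # Placeholder: a real system would contact a majority of peers.
        return self.store.get(key)

    def read(self, key: str) -> Tuple[str, Optional[str]]:
        # With leader-only reads disabled, always take the quorum path.
        if not self.leader_only_reads:
            return ("quorum", self.quorum_read(key))
        value = self.store.get(key)
        # Check the lease AFTER the local read: if it lapsed while we were
        # reading, another leader may have accepted writes in the meantime,
        # so the local value cannot be trusted.
        if self.lease_valid():
            return ("local", value)
        return ("quorum", self.quorum_read(key))
```

The point of the ordering is that a lease check made only before the read leaves a window in which the lease can expire mid-read. Checking afterwards closes that window, and the configuration switch provides the always-quorum fallback.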
Issues that have been punted to at least 2.0.x (if not 2.1):
[ ] We incorrectly commit views against the joint quorum, when we only need to commit against the initial view. This is only a minor issue, but worth fixing. (basho/riak_ensemble#3)
Issues that have been punted to 2.1:
[ ] Dynamic ring resizing (esp. shrinking) is not safe in clusters that use strong consistency (basho/riak_kv#900)