basho / riak_core

Distributed systems infrastructure used by Riak.
Apache License 2.0
1.23k stars 391 forks source link

Fix claim tail violations and unbalanced rings #913

Closed russelldb closed 6 years ago

russelldb commented 6 years ago

This PR has some fixes for riak_core claim. There is a full write up in docs/claim-fixes.md.

In brief: In some situations riak_core_claim can produce rings that have tail violations. A tail violation is where some preflists (drawn from the tail of the ring) do not enforce the target-n-val property. One example is ring using the old claim diag for 16 vnodes and 5 nodes. Such a ring would have a tail of 1. The last two preflists from that ring would be [5, 1, 1] and [1, 1,2] both of which would mean >1 replica on a single node.

A further issue with claim was that distributing existing claim to a new node could result in an unbalanced ring. An example would be adding a 5th node to a 32 vnode ring of 4 nodes. The distribution went from [8, 8, 8, 8] to [6, 6, 6, 8, 6] . Before assigning vnodes claim calculates "deltas" as a measure of how many vnodes each node wants or should give up. Prior to the change, for this example the deltas were [2, 2, 2, 2, -6] which means "you can 2 partitions from each of 4 nodes in order to provide 6 on the 5th". Claim would be greedy and take all 2 from each of the first 3 nodes and stop. This fix balances the deltas before claim so that they always equal the wants of the claiming node(s). For the example the deltas would be [2, 2, 1, 1, -6] ensuring that the final balance of vnodes will be [6, 6, 7, 7, 6]

russelldb commented 6 years ago

riak_tests pass. Dialyzer passes. make test passes. FOR ME anyway.

russelldb commented 6 years ago

riak_tests pass. Dialyzer passes. make test passes. FOR ME anyway.

nickkeers commented 6 years ago

I got a couple of test errors from running make test to do with the btypes tests from brops - i merged in this pr and that fixed the failing tests. All tests passing!

Dialyzer also passes

russelldb commented 6 years ago

Is that +1 @nickkeers ?

nickkeers commented 6 years ago

Not from me i'm afraid, I don't think I know enough about riak_core to comment on this one sadly. But I wanted to mention I've ran it and tests pass etc

ghost commented 6 years ago

@russelldb @nickkeers

  1. Created a branch on my riak fork - develop-2.2-test-rdb-ramen-claim-rebase - based off latest develop-2.2.
  2. Tweaked rebar.config to pull in this branch of riak_core - and forced riak_repl_pb_api to version 2.0.6 (build failed otherwise) :
{riak_repl_pb_api, ".*", 
       {git, "git@github.com:basho/riak_repl_pb_api.git", {tag, "2.6.0"}}},
{riak_core, ".*", 
       {git, "git://github.com/nhs-riak/riak_core.git", {branch, "rdb/ramen-claim-rebase"}}},
  1. Created a three node cluster - added and removed nodes randomly - everything worked correctly - and partition distribution remained balanced.
  2. Created a 8 node cluster - added nodes to cluster in a random order, worked correctly - partition distribution remained balanced.
  3. Ran quickcheck tests from rdb/ramen-claim-rebase branch - failed at the end because I need to rebuild Erlang from source - erlang crypto fun, fun, fun.
  module 'riak_core_security_tests'
    riak_core_security_tests: security_test_ (trust auth works)...Assertion failed: (ctx), function digest_update, file /BuildRoot/Library/Caches/com.apple.xbs/Sources/boringssl/boringssl-109.40.1/apple/crypto/digests.c, line 49.
make: *** [test] Abort trap: 6

+1 from me