tarantool / vshard

The new generation of sharding based on virtual buckets

Prevent doubled buckets caused by manual `bucket_send` #414

Closed Serpentian closed 1 year ago

Serpentian commented 1 year ago
  1. Suppose manual `vshard.storage.bucket_send()` is used. Storage S1 has bucket B. It is sent to S2, but then the connection breaks. The bucket ends up in the state S1 {B: sending-to-s2}, S2 {B: active}. If the user now does `vshard.storage.bucket_send()` from S2 to S3, we get: S1 {B: sending-to-s2}, S2 {}, S3 {B: active}. When the recovery fiber wakes up on S1, it sees that B is sending-to-s2 but S2 doesn't have the bucket. Recovery then assumes that S2 already deleted B and recovers it on S1. Now we have S1 {B: active} and S3 {B: active}. This is quite easy to achieve with manual `vshard.storage.bucket_send()`. With automatic rebalancing it normally shouldn't happen, because the rebalancer won't try to apply new routes while there are sending or receiving buckets anywhere.

  2. The assumption in the last sentence of [1] might break due to master changes. Imagine a replicaset with replicas R1 (master) and R2 that lost the connection to each other. R1 has a bucket which is sending/receiving but has already been recovered to the active state in another replicaset. Now R1 is demoted and R2 becomes master. R2 doesn't have this bucket in any state, so the rebalancer will happily apply new routes. Then we might get the same situation as in [1], or something similar.
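The sequence in [1] can be sketched as a toy state machine. This is a hypothetical Python model for illustration only: the `Storage`, `bucket_send`, and `recovery` names mirror vshard concepts but are not vshard's actual code, and the real statuses and RPC protocol are more involved.

```python
# Toy model of scenario [1]: bucket statuses are plain strings, and
# bucket_send() reflects the CURRENT ordering, where the remote copy
# becomes 'active' before the local copy is dropped.

class Storage:
    def __init__(self, name):
        self.name = name
        self.buckets = {}  # bucket_id -> status

def bucket_send(src, dst, bucket_id, connection_ok=True):
    """Current ordering: remote side activates first, then the
    local copy is deleted. A broken connection leaves the local
    copy stuck in 'sending'."""
    src.buckets[bucket_id] = 'sending'
    dst.buckets[bucket_id] = 'receiving'
    dst.buckets[bucket_id] = 'active'   # remote copy goes active first
    if not connection_ok:
        return                          # src never learns the send finished
    del src.buckets[bucket_id]          # local copy dropped

def recovery(src, dst, bucket_id):
    """Recovery fiber on src: if dst no longer has the bucket, assume
    dst already deleted it and re-activate the local copy."""
    if src.buckets.get(bucket_id) == 'sending' and bucket_id not in dst.buckets:
        src.buckets[bucket_id] = 'active'

s1, s2, s3 = Storage('S1'), Storage('S2'), Storage('S3')
bucket_send(s1, s2, 'B', connection_ok=False)  # S1 {B: sending}, S2 {B: active}
bucket_send(s2, s3, 'B')                       # S2 {}, S3 {B: active}
recovery(s1, s2, 'B')                          # S1 wrongly revives B

# The bucket is now doubled:
assert s1.buckets['B'] == 'active'
assert s3.buckets['B'] == 'active'
```

The model shows why recovery's reasoning is unsound: "S2 doesn't have B" is ambiguous between "S2 deleted B after finishing the transfer" and "S2 sent B onward", and the current ordering gives recovery no way to tell the two apart.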

Maybe the fix is as simple as making `vshard.storage.bucket_send()` turn the local bucket to SENT first and only then the remote bucket to ACTIVE; currently it is the other way around. Then [1] becomes impossible. If the connection breaks, the local SENT bucket is eventually deleted, and the remote RECEIVING bucket is turned ACTIVE by recovery once the connection is back.
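The proposed ordering can be checked in the same toy model. Again, this is an illustrative sketch, not vshard's implementation; `bucket_send_fixed` is a hypothetical name.

```python
# Toy model of the PROPOSED ordering: the local bucket becomes 'sent'
# before the remote one becomes 'active'.

class Storage:
    def __init__(self, name):
        self.name = name
        self.buckets = {}  # bucket_id -> status

def bucket_send_fixed(src, dst, bucket_id, connection_ok=True):
    src.buckets[bucket_id] = 'sending'
    dst.buckets[bucket_id] = 'receiving'
    src.buckets[bucket_id] = 'sent'     # local side commits first
    if not connection_ok:
        return                          # remote stays 'receiving'
    dst.buckets[bucket_id] = 'active'

s1, s2 = Storage('S1'), Storage('S2')
bucket_send_fixed(s1, s2, 'B', connection_ok=False)

# With the fix, a broken connection leaves the remote copy in
# 'receiving', so a manual bucket_send(S2 -> S3) of B would be refused
# and scenario [1] cannot start. Recovery later garbage-collects the
# local SENT copy and turns the remote RECEIVING bucket ACTIVE.
assert s1.buckets['B'] == 'sent'
assert s2.buckets['B'] == 'receiving'
```

The key property is that at no point do two storages both hold B in a sendable state: the bucket is either SENT locally (doomed to deletion) or RECEIVING remotely (not yet usable), never ACTIVE in two places.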

Proposed by @Gerold103 in https://github.com/tarantool/vshard/issues/412