tarantool / vshard

The new generation of sharding based on virtual buckets

storage: partly fix doubled buckets #415

Closed Serpentian closed 1 year ago

Serpentian commented 1 year ago

Currently, buckets with the 'active' status can appear on several shards when vshard.storage.bucket_send() is used manually in the following situation. Storage S1 has bucket B and sends it to S2, but then the connection breaks. The buckets are now in the state S1 {B: sending-to-s2}, S2 {B: active}. If the user then calls vshard.storage.bucket_send() to move B from S2 to S3, we get S1 {B: sending-to-s2}, S2 {}, S3 {B: active}. When the recovery fiber wakes up on S1, it sees that B is sending-to-s2 but S2 doesn't have the bucket. Recovery then assumes that S2 has already deleted B and restores it on S1. Now we have S1 {B: active} and S3 {B: active}: the doubled buckets situation.
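The sequence above can be replayed with a small toy model. This is only a sketch and not vshard code: each storage is a plain dict of bucket statuses, the function names `send_old_order` and `recover_sending` are hypothetical, and the recovery rule is reduced to the two cases mentioned in the description.

```python
# Toy model of the scenario above. Not vshard code: each storage is a dict
# {bucket: status}, and all function names here are hypothetical.

def send_old_order(src, dst, dst_name, bucket, conn_breaks=False):
    """Old order: the remote bucket becomes active before the local one is dropped."""
    src[bucket] = "sending-to-" + dst_name
    dst[bucket] = "active"
    if conn_breaks:
        return           # the source never learns that the send finished
    del src[bucket]      # SENT followed by garbage collection, collapsed into one step


def recover_sending(src, storages, bucket):
    """Simplified recovery of a bucket stuck in a sending state on `src`."""
    status = src.get(bucket, "")
    if not status.startswith("sending-to-"):
        return
    dst = storages[status[len("sending-to-"):]]
    if dst.get(bucket) == "active":
        del src[bucket]          # destination owns the bucket, drop the local copy
    elif bucket not in dst:
        src[bucket] = "active"   # assumes the destination deleted it: wrong here


storages = {"s1": {"B": "active"}, "s2": {}, "s3": {}}
s1, s2, s3 = storages["s1"], storages["s2"], storages["s3"]

send_old_order(s1, s2, "s2", "B", conn_breaks=True)  # S1 {B: sending-to-s2}, S2 {B: active}
send_old_order(s2, s3, "s3", "B")                    # S2 {}, S3 {B: active}
recover_sending(s1, storages, "B")                   # S1 {B: active}

owners = sorted(name for name, st in storages.items() if st.get("B") == "active")
print(owners)  # ['s1', 's3']
```

The model ends with B active on both s1 and s3, which is the doubled state described above.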

Let's fix that by turning the local bucket to SENT first and only after that turning the remote bucket to ACTIVE. This is safe, as the local bucket will be garbage collected and the remote one will be turned to ACTIVE by the recovery process.
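A minimal sketch of the proposed ordering, again as a toy model and not the actual patch. The names `send_new_order`, `recover_receiving` and `gc` are hypothetical, and the recovery rule for a receiving bucket (activate it when the source copy is sent or already gone) is an assumption made for this illustration.

```python
# Toy model of the proposed order. Not vshard code: each storage is a dict
# {bucket: status}, and all function names here are hypothetical.

def send_new_order(src, dst, bucket, conn_breaks=False):
    """New order: the local bucket becomes sent before the remote one is activated."""
    src[bucket] = "sending"
    dst[bucket] = "receiving"
    src[bucket] = "sent"         # local bucket is finalized first
    if conn_breaks:
        return                   # the remote bucket stays in receiving
    dst[bucket] = "active"


def recover_receiving(dst, src, bucket):
    """Assumed rule: activate a receiving bucket once the source gave it up."""
    if dst.get(bucket) == "receiving" and src.get(bucket) in (None, "sent"):
        dst[bucket] = "active"


def gc(storage):
    """Drop every bucket that is in the sent state."""
    for bucket in [b for b, status in storage.items() if status == "sent"]:
        del storage[bucket]


def active_owners(storages, bucket):
    return sorted(n for n, st in storages.items() if st.get(bucket) == "active")


storages = {"s1": {"B": "active"}, "s2": {}}
s1, s2 = storages["s1"], storages["s2"]

send_new_order(s1, s2, "B", conn_breaks=True)    # S1 {B: sent}, S2 {B: receiving}
owners_after_failure = active_owners(storages, "B")

recover_receiving(s2, s1, "B")                   # S2 {B: active}
gc(s1)                                           # S1 {}
owners_after_recovery = active_owners(storages, "B")

print(owners_after_failure, owners_after_recovery)  # [] ['s2']
```

In this model a broken connection leaves B temporarily active nowhere, rather than active in two places, and recovery on the receiving side restores a single owner.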

Closes #414

NO_DOC=bugfix

Serpentian commented 1 year ago

The test storage_1_1 fails on the master branch too; I created a ticket for that.

Serpentian commented 1 year ago

I found a bunch of tests that became flaky after this patch. I'll convert the PR to draft until they're fixed.

Serpentian commented 1 year ago

Also, please double-check for yourself that my idea is feasible at all. Maybe I was wrong somewhere.

The test storage_1_1_1 reproduces exactly what you said: without this patch we get doubled buckets.