tarantool / vshard

The new generation of sharding based on virtual buckets

bucket garbage collection does not work correctly with MVCC enabled (and maybe even w/o it) #314

Closed CuriousGeorgiy closed 2 years ago

CuriousGeorgiy commented 2 years ago

Description

Bucket garbage collection assumes that each increment of M.bucket_generation, made by the on_replace trigger bucket_generation_increment on box.space._bucket, corresponds to a tuple with consts.BUCKET.GARBAGE or consts.BUCKET.SENT status that will be visible when the bucket garbage collection loop queries box.space._bucket. This assumption does not hold with MVCC enabled: the transaction manager can show the garbage collector the version of a tuple from before the replace. It may also fail by default (with MVCC disabled): the replace that changed the bucket status might not yet be committed to the WAL when the bucket garbage collection fiber queries box.space._bucket. This second hypothesis has not been tested and requires further investigation.
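To make the suspected race easier to follow, here is a minimal Python model of it. This is not vshard code: the class names, the `collected_generation` field and the split into `pending` and `committed` versions are all hypothetical stand-ins, assumed only for illustration. The sketch shows a generation counter that is bumped at replace time while the reader still sees the old tuple version, so a collector that skips its pass when the generation is unchanged can miss the bucket for good.

```python
from dataclasses import dataclass, field


@dataclass
class BucketSpace:
    """Toy stand-in for box.space._bucket with MVCC-like visibility (hypothetical)."""
    committed: dict = field(default_factory=dict)  # versions other readers can see
    pending: dict = field(default_factory=dict)    # replaced, but not yet visible
    generation: int = 0                            # stand-in for M.bucket_generation

    def replace(self, bucket_id, status):
        # Models the on_replace trigger: the counter is bumped at statement
        # time, before the new tuple version becomes visible to other readers.
        self.pending[bucket_id] = status
        self.generation += 1

    def commit(self):
        # The new versions become visible only now.
        self.committed.update(self.pending)
        self.pending.clear()

    def select_collectable(self):
        # What a concurrent reader sees: committed versions only.
        return sorted(b for b, s in self.committed.items()
                      if s in ("sent", "garbage"))


class GarbageCollector:
    """Toy collector that skips a pass when the generation is unchanged."""

    def __init__(self, space):
        self.space = space
        self.collected_generation = 0  # hypothetical bookkeeping field
        self.collected = []

    def step(self):
        """Run one pass; return True if the pass actually scanned the space."""
        if self.collected_generation == self.space.generation:
            return False  # nothing changed since the last pass, so skip
        seen_generation = self.space.generation
        for bucket_id in self.space.select_collectable():
            self.collected.append(bucket_id)
            del self.space.committed[bucket_id]
        # The generation is recorded as handled even if nothing was found.
        self.collected_generation = seen_generation
        return True


def run_scenario():
    space = BucketSpace(committed={300: "sending"})
    gc = GarbageCollector(space)
    space.replace(300, "sent")  # generation bumped; reader still sees 'sending'
    first = gc.step()           # scans, finds nothing, records the generation
    space.commit()              # 'sent' becomes visible only now
    second = gc.step()          # generation unchanged, so the pass is skipped
    return first, second, gc.collected, space.committed
```

In this model `run_scenario()` leaves bucket 300 in the `sent` state and never collects it, because the only generation change was consumed by a pass that could not yet see the new status.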

Reproducers

Behaviour

The following console output (with custom log messages inserted for clarity) demonstrates the system's behaviour with MVCC enabled and the reproducer above:

sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> in rebalancer before last bucket replace
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> before txn_commit_stmt
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> in bucket generation_increment
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> after txn_commit_stmt
sA-1 | 2022-01-14 15:28:24.001 [84516] main/138/vshard.gc I> in gc last bucket=--- [300, 'sending', 'bbbbbbbb-0000-0000-0000-000000000000']
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> in rebalancer after last bucket replace
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> last_tuple=300
sA-1 | 2022-01-14 15:28:24.001 [84516] main/161/vshard.rebalancer_worker_1 I> last_bucket=--- [300, 'sent', 'bbbbbbbb-0000-0000-0000-000000000000']

As described above, the bucket garbage collector never enters its loop again after this point, and hence never collects the last bucket (id=300). The bucket generation had already been updated, even though the garbage collector did not see the last replaced bucket in the consts.BUCKET.SENT state: in the log above it still read the bucket as 'sending'.

Consideration

Consulted @yngvar-antonsson, @filonenko-mikhail and @Gerold103: all consider the described behaviour a bug in vshard.