apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

Data rebalancing may get stuck when Redwood is too fast in write #3664

Open xumengpanda opened 4 years ago

xumengpanda commented 4 years ago

In a fast restore experiment, we (Evan and I) noticed that a two-member storage server (SS) team receives about 200 MB/s (?) of input (write) bytes but serves only ~30 MB/s of output (read) bytes. The cluster uses the Redwood storage engine with double replication.

Data distribution cannot finish relocating shards to rebalance the load, because the destination SSs read data from the hammered hot SSs more slowly than new data is written to them, so they never catch up.
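
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not how data distribution actually tracks shard moves: it assumes every written byte eventually has to be read off the hot team, and it treats the rates above (including the uncertain 200 MB/s figure) as constant. The function names `backlog_after` and `seconds_to_drain` are hypothetical and exist only for this illustration.

```python
from typing import Optional


def backlog_after(seconds: int, write_mb_s: float, read_mb_s: float,
                  initial_mb: float = 0.0) -> float:
    """Toy model: MB still waiting to be read off the hot team after `seconds`.

    Assumption (not from FDB source): the backlog grows by the write rate
    and shrinks by the read rate once per second, and never goes below zero.
    """
    backlog = initial_mb
    for _ in range(seconds):
        backlog = max(0.0, backlog + write_mb_s - read_mb_s)
    return backlog


def seconds_to_drain(initial_mb: float, write_mb_s: float,
                     read_mb_s: float) -> Optional[float]:
    """Seconds until the backlog is empty, or None if it can never drain."""
    net_drain = read_mb_s - write_mb_s
    if net_drain <= 0:
        return None  # reads cannot outpace writes, so relocation never finishes
    return initial_mb / net_drain


if __name__ == "__main__":
    # Rates reported in this issue: ~200 MB/s in, ~30 MB/s out.
    print(backlog_after(60, 200, 30))       # backlog after one minute, in MB
    print(seconds_to_drain(1000, 200, 30))  # None: never drains at these rates
    print(seconds_to_drain(1000, 20, 30))   # drains only if writes drop below reads
```

Under these assumptions the backlog grows by about 170 MB every second, which is why the relocation only completes if the write rate into the hot team is brought below the rate at which destinations can read from it.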

We may need to run a write-heavy workload against Redwood to confirm whether this can happen. If it can, we need to see whether the tag-throttling feature can resolve it.

ajbeamon commented 4 years ago

Evan to reassign