apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

Clients fail to send read requests to all replicas #2895

Open ajbeamon opened 4 years ago

ajbeamon commented 4 years ago

We recently encountered a situation where a two-storage-server team was hot with reads. At some point, one of these storage servers started experiencing noticeably more reads than the other. We decided to exclude the worse one, and after data movement completed we ended up in a state where only a single storage server was hot. A subsequent exclude of the new process had the same outcome.

Eventually we bounced the clients, and load started to distribute evenly between the two storage servers again.

xumengpanda commented 4 years ago

Is there a way to reproduce this behavior in our simulation tests? I'm thinking out loud: if each client records which SS sent it data, and we check that the number of replies from each SS in a team is "roughly" the same, we might be able to reproduce it.
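A minimal sketch of the balance check described above, written in Python for illustration rather than as FoundationDB simulation code. The function name `replies_roughly_balanced`, the `tolerance` parameter, and the storage server labels are all hypothetical; the only idea taken from the thread is counting replies per SS on the client and comparing them across the team.

```python
from collections import Counter


def replies_roughly_balanced(reply_sources, team, tolerance=0.2):
    """Check whether each storage server in `team` served a similar share of replies.

    reply_sources: iterable of storage server IDs, one entry per reply the
                   client received (hypothetical client-side record).
    team:          list of storage server IDs that hold the data.
    tolerance:     allowed relative deviation from an even split (assumed value).
    """
    counts = Counter(reply_sources)
    total = sum(counts[ss] for ss in team)
    if total == 0:
        return True  # no replies recorded, so there is nothing to judge

    expected = total / len(team)
    # Every team member must be within `tolerance` of the even share.
    return all(abs(counts[ss] - expected) <= tolerance * expected for ss in team)


# Even split between two storage servers: passes the check.
balanced = replies_roughly_balanced(["ss1", "ss2"] * 50, ["ss1", "ss2"])

# One storage server serving 90% of reads, as in the report: fails the check.
skewed = replies_roughly_balanced(["ss1"] * 90 + ["ss2"] * 10, ["ss1", "ss2"])
```

A real test would also need a workload that triggers the skew (for example, excluding one member of the team mid-run), which this sketch does not attempt.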