linkedin / ambry

Distributed object store
https://github.com/linkedin/ambry/wiki
Apache License 2.0
1.74k stars, 275 forks

Changing replication logic to not wait for replicas to finish #2819

Closed manbearpig1996 closed 1 week ago

manbearpig1996 commented 2 months ago

Changing the logic of the replication cycle. Currently, in a replication cycle, we create groups and then wait for all groups to finish. With this change, all replica groups will run multiple iterations until any group reaches a predefined limit. Also added logic to terminate the cycle when we are adding/removing replicas or shutting the thread down.

Added tests for group generation, active replication & leader based replication.

If we set the limit to 1, the new logic works exactly the same as the old logic.
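
To illustrate the control flow described above, here is a minimal sketch and not the code in this PR. The names `ReplicationCycleSketch`, `ReplicaGroup`, `runCycle`, `maxIterationsPerGroup` and `shouldTerminate` are hypothetical. It also assumes groups advance in lockstep, one iteration per round, whereas real groups would likely progress at different speeds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative only: hypothetical names, not Ambry's actual classes.
final class ReplicationCycleSketch {

  /** Hypothetical stand-in for a group of replicas replicated together. */
  static final class ReplicaGroup {
    final String name;
    int iterations = 0;

    ReplicaGroup(String name) {
      this.name = name;
    }

    /** Placeholder for one round of metadata exchange and fetch. */
    void runOneIteration() {
      iterations++;
    }
  }

  /**
   * Runs one cycle: every group keeps iterating until any group reaches
   * maxIterationsPerGroup, or until shouldTerminate reports true
   * (replicas being added/removed, or the thread shutting down).
   * Returns the number of rounds executed.
   */
  static int runCycle(List<ReplicaGroup> groups, int maxIterationsPerGroup,
      BooleanSupplier shouldTerminate) {
    int rounds = 0;
    while (!groups.isEmpty() && !shouldTerminate.getAsBoolean()) {
      boolean limitReached = false;
      for (ReplicaGroup group : groups) {
        group.runOneIteration();
        if (group.iterations >= maxIterationsPerGroup) {
          limitReached = true;
        }
      }
      rounds++;
      if (limitReached) {
        break; // end the cycle as soon as any group hits the limit
      }
    }
    return rounds;
  }

  public static void main(String[] args) {
    List<ReplicaGroup> groups = new ArrayList<>();
    groups.add(new ReplicaGroup("group-1"));
    groups.add(new ReplicaGroup("group-2"));
    // With a limit of 1, each group runs once and the cycle ends.
    int rounds = runCycle(groups, 1, () -> false);
    System.out.println("rounds with limit 1: " + rounds);
  }
}
```

In this sketch a limit of 1 gives a single pass over every group before the cycle ends, which mirrors the old behaviour of creating groups and waiting for each to finish once.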

snalli commented 1 month ago

For large values of iterations in a cycle, we would not pick up any new replicas added to a server because that set is refreshed only at the end of a cycle. Similarly, we won't pick up any removals. Is this correct? @justinlin-linkedin