Yelp / kafka-utils

Apache License 2.0

set_replication_factor command to handle mismatched replication factor partitions #250

Baisang closed 4 years ago

Baisang commented 4 years ago

Some manual testing:

Suppose we have a topic whose partitions have mismatched replication factors (here partition 0 has RF 2 and partition 1 has RF 1):

baisang@noneofyourbusiness ~/kafka-utils> kafka-info -t test topic test_topic
<...>
Partitions details:
  Partition 0: Total size 0.0B
    Leader <redacted>: Start offset 4, End offset 4, Size 0.0B
    Follower <redacted>: Insync True, End offset 4, Size 0.0B
  Partition 1: Total size 0.0B
    Leader <redacted>: Start offset 0, End offset 0, Size 0.0B
Topic specific configuration:
    None

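The old set_replication_factor can't handle this situation properly. It treats the whole topic as RF 1 and adds a replica to every partition, so partition 0 would end up with three replicas: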

baisang@noneofyourbusiness ~/kafka-utils> kafka-cluster-manager -t test set_replication_factor --topic test_topic 2
INFO:kazoo.client:Connecting to my_zk:2181
INFO:kazoo.client:Zookeeper connection established, state: CONNECTED
INFO:kafka-zookeeper-manager:Fetching current cluster-topology from Zookeeper...
INFO:SetReplicationFactorCmd:Increasing topic test_topic replication factor from 1 to 2.
INFO:SetReplicationFactorCmd:Total number of actions before reduction: 2.
INFO:SetReplicationFactorCmd:Number of partition changes: 2. Number of leader-only changes: 0
INFO:SetReplicationFactorCmd:Proposed plan assignment {'version': 1, 'partitions': [{'topic': u'test_topic', 'partition': 0, 'replicas': [<redacted>, <redacted>, <redacted>]}, {'topic': u'test_topic', 'partition': 1, 'replicas': [<redacted>, <redacted>]}]}
INFO:SetReplicationFactorCmd:Proposed-plan actions count: 2
INFO:SetReplicationFactorCmd:Proposed plan won't be executed (--apply and confirmation needed).
INFO:kazoo.client:Closing connection to my_zk:2181
INFO:kazoo.client:Zookeeper session lost, state: CLOSED

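With the new version it works, but only when the --rf-mismatch flag is passed. Without the flag, plan validation rejects the topic because of the mismatch; with it, the plan is sent to Zookeeper: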

baisang@noneofyourbusiness ~/kafka-utils> ./kafka-cluster-manager -t test --apply set_replication_factor --topic test_topic 2
INFO:kazoo.client:Connecting to my_zk:2181, use_ssl: False
INFO:kazoo.client:Zookeeper connection established, state: CONNECTED
INFO:kafka-zookeeper-manager:Fetching current cluster-topology from Zookeeper...
INFO:SetReplicationFactorCmd:Increasing topic partition test_topic:1 replication factor from 1 to 2.
INFO:SetReplicationFactorCmd:Total number of actions before reduction: 1.
INFO:SetReplicationFactorCmd:Number of partition changes: 1. Number of leader-only changes: 0
INFO:SetReplicationFactorCmd:Proposed plan assignment {'version': 1, 'partitions': [{'topic': 'test_topic', 'partition': 1, 'replicas': [<redacted>, <redacted>]}]}
INFO:SetReplicationFactorCmd:Proposed-plan actions count: 1
Execute Proposed Plan? [yes/no] yes
INFO:kafka-zookeeper-manager:Fetching current cluster-topology from Zookeeper...
ERROR:kafka_utils.util.validation:Mismatch in replication-factor of partitions for topic test_topic
ERROR:kafka_utils.util.validation:Invalid assignment from cluster.
ERROR:kafka-zookeeper-manager:Given plan is invalid. Aborting new reassignment plan ... {'version': 1, 'partitions': [{'topic': 'test_topic', 'partition': 1, 'replicas': [<redacted>, <redacted>]}]}
ERROR:SetReplicationFactorCmd:Plan execution unsuccessful.
INFO:kazoo.client:Closing connection to my_zk:2181
INFO:kazoo.client:Zookeeper session lost, state: CLOSED
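
The rejection above comes from a check on the cluster's current assignment, not on the proposed plan. As a rough illustration of that kind of check, here is a minimal sketch. It is not the kafka-utils implementation: the function names, the `allow_rf_mismatch` parameter, and the broker ids are all hypothetical.

```python
from collections import defaultdict


def topics_with_rf_mismatch(assignment):
    """Return the topics whose partitions do not all have the same replica count.

    `assignment` maps (topic, partition) -> list of broker ids (assumed shape).
    """
    replica_counts = defaultdict(set)
    for (topic, _partition), replicas in assignment.items():
        replica_counts[topic].add(len(replicas))
    return sorted(t for t, counts in replica_counts.items() if len(counts) > 1)


def is_valid_cluster_assignment(assignment, allow_rf_mismatch=False):
    """Reject a mismatched cluster assignment unless explicitly allowed."""
    return allow_rf_mismatch or not topics_with_rf_mismatch(assignment)


# Hypothetical broker ids standing in for the redacted ones.
cluster = {
    ("test_topic", 0): [101, 102],  # RF 2
    ("test_topic", 1): [101],       # RF 1
}

print(topics_with_rf_mismatch(cluster))
print(is_valid_cluster_assignment(cluster))
print(is_valid_cluster_assignment(cluster, allow_rf_mismatch=True))
```

Making the relaxed check opt-in keeps the stricter validation as the default for every other command.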

baisang@noneofyourbusiness ~/kafka-utils> ./kafka-cluster-manager -t test --apply set_replication_factor --topic test_topic 2 --rf-mismatch
INFO:kazoo.client:Connecting to my_zk:2181, use_ssl: False
INFO:kazoo.client:Zookeeper connection established, state: CONNECTED
INFO:kafka-zookeeper-manager:Fetching current cluster-topology from Zookeeper...
INFO:SetReplicationFactorCmd:Increasing topic partition test_topic:1 replication factor from 1 to 2.
INFO:SetReplicationFactorCmd:Total number of actions before reduction: 1.
INFO:SetReplicationFactorCmd:Number of partition changes: 1. Number of leader-only changes: 0
INFO:SetReplicationFactorCmd:Proposed plan assignment {'version': 1, 'partitions': [{'topic': 'test_topic', 'partition': 1, 'replicas': [<redacted>, <redacted>]}]}
INFO:SetReplicationFactorCmd:Proposed-plan actions count: 1
Execute Proposed Plan? [yes/no] yes
INFO:kafka-zookeeper-manager:Fetching current cluster-topology from Zookeeper...
INFO:kafka-zookeeper-manager:Sending plan to Zookeeper...
INFO:kafka-zookeeper-manager:Re-assign partitions node in Zookeeper updated successfully with {'version': 1, 'partitions': [{'topic': 'test_topic', 'partition': 1, 'replicas': [<redacted>, <redacted>]}]}
INFO:SetReplicationFactorCmd:Plan sent to zookeeper for reassignment successfully.
INFO:kazoo.client:Closing connection to my_zk:2181
INFO:kazoo.client:Zookeeper session lost, state: CLOSED
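
The successful run changes only partition 1, since partition 0 already has two replicas. A minimal sketch of that per-partition planning follows. It assumes a naive broker choice and is not the actual kafka-utils logic, which picks brokers with cluster balance in mind. `plan_replication_factor` and the broker ids are hypothetical.

```python
def plan_replication_factor(assignment, topic, target_rf, brokers):
    """Build a reassignment plan that touches only partitions whose RF differs
    from target_rf.

    `assignment` maps (topic, partition) -> list of broker ids (assumed shape).
    New replicas are taken from `brokers` in order, which is a simplification.
    """
    partitions = []
    for (t, partition), replicas in sorted(assignment.items()):
        if t != topic or len(replicas) == target_rf:
            continue  # other topic, or partition already at the target RF
        if len(replicas) < target_rf:
            candidates = [b for b in brokers if b not in replicas]
            new_replicas = replicas + candidates[: target_rf - len(replicas)]
        else:
            # Naive decrease: keep the first target_rf replicas.
            new_replicas = replicas[:target_rf]
        partitions.append(
            {"topic": t, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


# Hypothetical broker ids standing in for the redacted ones.
cluster = {
    ("test_topic", 0): [101, 102],  # already RF 2, left untouched
    ("test_topic", 1): [101],       # RF 1, needs one more replica
}

plan = plan_replication_factor(cluster, "test_topic", 2, brokers=[101, 102, 103])
print(plan)
```

Computing the change per partition rather than per topic is what avoids the three-replica proposal for partition 0 seen with the old command.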