strimzi / strimzi-canary


Canary topic seen to ignore rack awareness - 6 brokers cluster across 3 AZs #190

Closed: k-wall closed this issue 2 years ago

k-wall commented 2 years ago

Testing canary 0.2.0 with a six-broker Kafka cluster deployed to Kubernetes with nodes across three AZs, we noticed that the canary topic partitions were not respecting the Kafka rack awareness feature.

The partition assignment looked like this:

./bin/kafka-topics.sh --topic __redhat_strimzi_canary --describe --bootstrap-server localhost:9096
Topic: __redhat_strimzi_canary    TopicId: T-cSuR5_Te6nEYMec3xSjg    PartitionCount: 6    ReplicationFactor: 3    Configs: min.insync.replicas=2,cleanup.policy=delete,segment.bytes=16384,retention.ms=600000,message.format.version=3.0-IV1,max.message.bytes=1048588
    Topic: __redhat_strimzi_canary    Partition: 0    Leader: 0    Replicas: 0,1,2    Isr: 0,1,2
    Topic: __redhat_strimzi_canary    Partition: 1    Leader: 1    Replicas: 1,2,3    Isr: 1,2,3
    Topic: __redhat_strimzi_canary    Partition: 2    Leader: 2    Replicas: 2,3,4    Isr: 2,3,4
    Topic: __redhat_strimzi_canary    Partition: 3    Leader: 3    Replicas: 3,4,5    Isr: 3,4,5
    Topic: __redhat_strimzi_canary    Partition: 4    Leader: 4    Replicas: 4,5,0    Isr: 4,5,0
    Topic: __redhat_strimzi_canary    Partition: 5    Leader: 5    Replicas: 5,0,1    Isr: 5,0,1

For instance, partition 2 is on brokers 2, 3, and 4, but brokers 2 and 3 are both in AZ us-east-1b.

oc logs kafka-instance-kafka-2 | grep -E ^broker.rack
broker.rack=us-east-1b

oc logs kafka-instance-kafka-3 | grep -E ^broker.rack
broker.rack=us-east-1b
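
To make the violation easy to spot, here is a minimal sketch in Go (the canary's implementation language). The helper name and the rack entries for brokers 0, 1, 4, and 5 are assumptions for illustration; only brokers 2 and 3 are confirmed above as us-east-1b.

package main

import "fmt"

// racksSpanned counts the distinct racks covered by a partition's replicas.
// With RF=3 across three AZs, a rack-aware assignment should span 3 racks.
func racksSpanned(replicas []int32, brokerRack map[int32]string) int {
    seen := map[string]bool{}
    for _, b := range replicas {
        seen[brokerRack[b]] = true
    }
    return len(seen)
}

func main() {
    // Brokers 2 and 3 are confirmed in us-east-1b above; the other
    // entries are assumed for the example.
    brokerRack := map[int32]string{
        0: "us-east-1a", 1: "us-east-1a", 2: "us-east-1b",
        3: "us-east-1b", 4: "us-east-1c", 5: "us-east-1c",
    }
    // Partition 2's replicas from the describe output above:
    // prints 2, not the expected 3.
    fmt.Println(racksSpanned([]int32{2, 3, 4}, brokerRack))
}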

The canary topic was created by the canary:

I0610 13:03:45.234567       1 topic.go:162] The canary topic __redhat_strimzi_canary was created
I0610 13:03:45.234593       1 consumer.go:135] Waiting consumer group to be up and running

All of the brokers appear to have completed startup before that:

kwall@Oslo kas-installer % oc logs kafka-instance-kafka-0 | grep "Startup complete"
2022-06-10T13:03:24Z INFO  [main] [GroupCoordinator] [GroupCoordinator 0]: Startup complete.
2022-06-10T13:03:24Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=0] Startup complete.
kwall@Oslo kas-installer % oc logs kafka-instance-kafka-1 | grep "Startup complete"
2022-06-10T13:03:25Z INFO  [main] [GroupCoordinator] [GroupCoordinator 1]: Startup complete.
2022-06-10T13:03:25Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=1] Startup complete.
kwall@Oslo kas-installer % oc logs kafka-instance-kafka-2 | grep "Startup complete"
2022-06-10T13:03:21Z INFO  [main] [GroupCoordinator] [GroupCoordinator 2]: Startup complete.
2022-06-10T13:03:21Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=2] Startup complete.
kwall@Oslo kas-installer % oc logs kafka-instance-kafka-3 | grep "Startup complete"
2022-06-10T13:03:27Z INFO  [main] [GroupCoordinator] [GroupCoordinator 3]: Startup complete.
2022-06-10T13:03:27Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=3] Startup complete.
kwall@Oslo kas-installer % oc logs kafka-instance-kafka-4 | grep "Startup complete"
2022-06-10T13:03:30Z INFO  [main] [GroupCoordinator] [GroupCoordinator 4]: Startup complete.
2022-06-10T13:03:30Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=4] Startup complete.
kwall@Oslo kas-installer % oc logs kafka-instance-kafka-5 | grep "Startup complete"
2022-06-10T13:03:34Z INFO  [main] [GroupCoordinator] [GroupCoordinator 5]: Startup complete.
2022-06-10T13:03:35Z INFO  [main] [TransactionCoordinator] [TransactionCoordinator id=5] Startup complete.

So I don't think this is due to the canary coming up before some of the brokers.
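
Incidentally, the assignment above follows the pattern replicas = (p, p+1, p+2) mod 6 for partition p, i.e. a simple consecutive layout that never consults broker.rack. For reference, here is a minimal sketch in Go of what a rack-aware assignment could look like, using the Shopify/sarama admin client; the helper and assignment logic are illustrative assumptions, not the actual canary code or the eventual fix:

package main

import (
    "log"

    "github.com/Shopify/sarama"
)

// rackAlternatingOrder returns broker IDs ordered so that consecutive
// entries rotate across racks; taking consecutive replicas from this
// order then spreads each replica set across AZs. Illustrative only.
func rackAlternatingOrder(brokers []*sarama.Broker) []int32 {
    byRack := map[string][]int32{}
    var racks []string
    for _, b := range brokers {
        r := b.Rack()
        if _, seen := byRack[r]; !seen {
            racks = append(racks, r)
        }
        byRack[r] = append(byRack[r], b.ID())
    }
    var ordered []int32
    for i := 0; len(ordered) < len(brokers); i++ {
        r := racks[i%len(racks)]
        if ids := byRack[r]; len(ids) > 0 {
            ordered = append(ordered, ids[0])
            byRack[r] = ids[1:]
        }
    }
    return ordered
}

func main() {
    cfg := sarama.NewConfig()
    cfg.Version = sarama.V2_0_0_0 // metadata v1+ is needed to see broker racks
    admin, err := sarama.NewClusterAdmin([]string{"localhost:9096"}, cfg)
    if err != nil {
        log.Fatal(err)
    }
    defer admin.Close()

    brokers, _, err := admin.DescribeCluster()
    if err != nil {
        log.Fatal(err)
    }
    ordered := rackAlternatingOrder(brokers)

    // One partition per broker, RF=3, replicas taken consecutively from
    // the rack-alternating order so each replica set spans three AZs.
    const rf = 3
    assignment := map[int32][]int32{}
    for p := range ordered {
        var replicas []int32
        for r := 0; r < rf; r++ {
            replicas = append(replicas, ordered[(p+r)%len(ordered)])
        }
        assignment[int32(p)] = replicas
    }

    // With an explicit assignment, Kafka expects numPartitions and
    // replicationFactor to be -1 in the CreateTopics request.
    err = admin.CreateTopic("__redhat_strimzi_canary", &sarama.TopicDetail{
        NumPartitions:     -1,
        ReplicationFactor: -1,
        ReplicaAssignment: assignment,
    }, false)
    if err != nil {
        log.Fatal(err)
    }
}

With the rack layout assumed above (1a: 0,1; 1b: 2,3; 1c: 4,5) this would yield replica sets such as (0,2,4) and (2,4,1), each spanning all three AZs.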

ppatierno commented 2 years ago

Closed via #191