linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License
2.74k stars 587 forks source link

Compare exact member of ISR with RS when deciding whether the partition is URP #1905

Closed CCisGG closed 1 year ago

CCisGG commented 2 years ago

Today, the way we determine a partition is an URP (Under Replicated Partition) based on the size of ISR and RS. E.g. if ISR = [A, B], and RS = [A, B, C], then the partition is treated as URP by CC.

However, in certain cases (usually because kafka is in a bad/inconsistent state), the ISR can be diverged from RS. E.g. ISR = [A, B, B], while RS = [D, E, F]. In this case, we still want CC to report this replica as URP, since the partition is not in a good state.

The proposal is, instead of comparing the size of ISR and RS, we should comparing the members in ISR and RS. If ISR != RS, CC should report it as URP.

suyashtava commented 2 years ago

Hi, @CCisGG , may I know if someone is working on it, if not? Can I pick this up

We are using CC extensible at Salesforce, and will be happy to contribute.

Mostly the change will go here: https://github.com/linkedin/cruise-control/blob/ffe8aa15d7bb63d2b23e52b720fdceeade6fd8f6/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/servlet/response/ClusterPartitionState.java#L115

suyashtava commented 2 years ago

Btw, I am interested to know, when ISR = [A, B, B], while RS = [D, E, F] This condition will happen.

I assume that A,B, D, E, F denote Brokers, right? and by RS, u meant to say current Replica, and ISR is inSync

CCisGG commented 2 years ago

@suyashtava Sure! Please feel free to take it!

It happened recently in our kafka cluster, where there was a weird bug in ZooKeeper that different ZK host has different Data. Since Kafka relies on ZooKeeper, it messed up the replica set when it reads data from zookeeper. This should rarely happens, but just in case it happens, we should make this change.

AnvarKhatik commented 2 years ago

Hey @suyashtava You're Still Working On This Issue

suyashtava commented 2 years ago

Yup, yup. I was on Call last week, didn’t get a chance. Will fix it this week.

Though I am trying to replicate the issue, but not able to replicate @CCisGG

On Tue, 27 Sep 2022 at 11:28 AM, Anvar Khatik @.***> wrote:

Hey @suyashtava https://github.com/suyashtava You're Still Working On This Issue

— Reply to this email directly, view it on GitHub https://github.com/linkedin/cruise-control/issues/1905#issuecomment-1259015922, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALNN2SZOZ4GAZLIOJJTMZZTWAKEGZANCNFSM6AAAAAAQNVTISQ . You are receiving this because you were mentioned.Message ID: @.***>

CCisGG commented 2 years ago

@suyashtava this is a rare case and it's hard to reproduce. I think it's ok to just change the logic and add some unit test for it.

suyashtava commented 2 years ago

Raised a PR: https://github.com/linkedin/cruise-control/pull/1923