linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License

Feature Request: Modification of the RackAwareGoal behavior #911

Closed apellegr06 closed 3 years ago

apellegr06 commented 5 years ago

Background

Currently, cruise-control provides a RackAwareGoal to manage partition rebalancing across the cluster. This goal requires the number of racks to be >= the largest replication factor of any topic.

If the number of racks is lower than that, cruise-control cannot rebalance partitions.
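The precondition described above can be sketched as a small check. This is only an illustration of the rule as stated in this issue, not Cruise Control's actual implementation; the class name, method names, and topic names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names come from Cruise Control. */
final class RackPreconditionSketch {

    /** Returns the largest replication factor across all topics (0 if there are none). */
    static int maxReplicationFactor(Map<String, Integer> replicationFactorByTopic) {
        int max = 0;
        for (int replicationFactor : replicationFactorByTopic.values()) {
            max = Math.max(max, replicationFactor);
        }
        return max;
    }

    /** Strict rack awareness needs one distinct rack per replica of the widest topic. */
    static boolean strictRackAwarenessFeasible(int numRacks,
                                               Map<String, Integer> replicationFactorByTopic) {
        return numRacks >= maxReplicationFactor(replicationFactorByTopic);
    }

    public static void main(String[] args) {
        // Hypothetical topics: one with 3 replicas, one with 4.
        Map<String, Integer> topics = new HashMap<>();
        topics.put("orders", 3);
        topics.put("audit", 4);

        System.out.println(strictRackAwarenessFeasible(4, topics)); // true
        System.out.println(strictRackAwarenessFeasible(2, topics)); // false
    }
}
```

With 2 racks and a topic that has 4 replicas, the check fails, which is the situation this request is about.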

Issue

This prerequisite cannot be met in some cases, but rebalancing is still needed and could work.

For example, consider a topic with 1 partition and 4 replicas:

Data Center A             | Data Center B
                          |
Broker1-rack A  replica 0 | Broker2-rack B
Broker3-rack A            | Broker4-rack B replica 1
Broker5-rack A  replica 2 | Broker6-rack B
Broker7-rack A            | Broker8-rack B replica 3

With this layout we are protected even if Data Center B is completely down and Broker5 is also down. In terms of partition distribution, however, there is no need to distinguish between brokers in the same data center.

Proposal

Change the rebalance behavior of RackAwareGoal by adding an optional parameter that sets the minimum rack number explicitly, instead of deriving it from the largest replication factor of any topic.

efeg commented 5 years ago

Thanks for the feature request!

I believe that, rather than modifying the current behavior of RackAwareGoal, we can add a new pluggable goal to satisfy these requirements. It would be a new soft goal that ideally extends the replica distribution goal with custom logic for distributing replicas over racks.

apellegr06 commented 5 years ago

Great! That will make cruise-control easier for me to use!

Thank you

mkandi commented 4 years ago

Thank you for the great work team 👍

We have a similar request too. We use AWS with 3 availability zones (AZs) in us1, which map to 3 racks in cruise-control, but some of the topics have a replication factor of more than 3.

To do any rebalancing operations, we had to unselect the Rack Aware goal and use the Skip Hard Goal Check option. Unfortunately, this seems to ignore rack awareness entirely, even for topics with replication factor <= 3; i.e., it assigns 2 replicas to brokers on the same rack for a topic/partition with a replication factor of 3.

As you have suggested, we will also consider writing a soft/pluggable goal to distribute replicas evenly across all racks.

Please share if anyone has any suggestions/improvements.

Thank you

mkandi commented 4 years ago

@efeg and Team, we have written a new pluggable goal to satisfy this requirement and would like to contribute it back. Hope this is ok.

efeg commented 4 years ago

@mkandi Yes that would be great -- we look forward to your PR!

mkandi commented 4 years ago

Thank you. I am working on adding the tests and will send it soon.

mkandi commented 4 years ago

@efeg and Team,

I added 2 new files for this new goal and modified a number of src/test files. A few questions/clarifications:

Please advise.

efeg commented 4 years ago

Hi @mkandi

License requirements

For open source contributions, no CLA is needed, but please make sure you have read our Contributing file, which implicitly prevents modification of the license file.

Test setup/running the tests

The gradlew clean test command should run without errors. Does the failure happen in existing tests or in the new tests you added? Would you share more about the error, e.g. the stack trace?

mkandi commented 4 years ago

Hi @efeg

Thank you for the clarifications. Sorry it took a while to get clarification/approval from our company. Just to confirm your statement that no CLA is needed: I need to submit all the code (including the new files) with the following license, right? (The code certainly expects the license and LinkedIn in it.)

Copyright 2020 LinkedIn Corp. Licensed under the BSD 2-Clause License (the "License"). See License in the project root for license information.

Please clarify.

Coming to the test failures, they appear to be Mac-related. I have tried flushing the routing table etc., but I keep getting the same exception. Here is the stack trace:


    java.lang.IllegalStateException: java.net.BindException: Can't assign requested address
        at com.linkedin.kafka.cruisecontrol.metricsreporter.utils.CCEmbeddedZookeeper.<init>(CCEmbeddedZookeeper.java:37)
        at com.linkedin.kafka.cruisecontrol.metricsreporter.utils.CCAbstractZookeeperTestHarness.setUp(CCAbstractZookeeperTestHarness.java:16)
        at com.linkedin.kafka.cruisecontrol.metricsreporter.utils.CCKafkaIntegrationTestHarness.setUp(CCKafkaIntegrationTestHarness.java:23)
        at com.linkedin.kafka.cruisecontrol.CruiseControlIntegrationTestHarness.start(CruiseControlIntegrationTestHarness.java:42)
        at com.linkedin.kafka.cruisecontrol.servlet.security.AuthenticationIntegrationTest.setup(AuthenticationIntegrationTest.java:55)

        Caused by:
        java.net.BindException: Can't assign requested address
            at sun.nio.ch.Net.bind0(Native Method)
            at sun.nio.ch.Net.bind(Net.java:433)
            at sun.nio.ch.Net.bind(Net.java:425)
            at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
            at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
            at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
            at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:687)
            at org.apache.zookeeper.server.ServerCnxnFactory.configure(ServerCnxnFactory.java:76)
            at com.linkedin.kafka.cruisecontrol.metricsreporter.utils.CCEmbeddedZookeeper.<init>(CCEmbeddedZookeeper.java:33)

Any help is appreciated. Thank you.

efeg commented 4 years ago

@mkandi

> so I need to submit all the code (including the new files) with the following license, right (the code is certainly expecting the license and LinkedIn in it):
>
> Copyright 2020 LinkedIn Corp. Licensed under the BSD 2-Clause License (the "License"). See License in the project root for license information.

Correct. Since the existing files already have a license string, this is only needed for the new files.

Thanks for sharing the stack trace. Unfortunately, I cannot reproduce this exception locally. java.net.BindException: Can't assign requested address typically indicates that the port is in use, or that the requested local address could not be assigned. However, InetSocketAddress(localHost, 0) lets the system pick an ephemeral port in the bind operation, so a port conflict is unlikely. I am curious whether there is a problem with the localhost resolution here (see the line). Would you please run the failing integration test (i.e. AuthenticationIntegrationTest) in debug mode with a breakpoint on the line I pointed out, and share the resolved value?
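As a way to separate the two possible causes outside the test suite, a standalone check along these lines may help. It is a minimal sketch that assumes the harness obtains its address from something like InetAddress.getLocalHost(), which this thread does not confirm; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Illustrative diagnostic only; not part of the Cruise Control test harness. */
final class LocalBindCheck {

    /** Binds to port 0 on the given address and returns the ephemeral port the OS picked. */
    static int bindEphemeral(InetAddress address) throws IOException {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(address, 0));
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        // The loopback address should always be bindable; this is the control case.
        InetAddress loopback = InetAddress.getLoopbackAddress();
        System.out.println("loopback " + loopback + " bound on port " + bindEphemeral(loopback));

        // getLocalHost() resolves the machine's hostname. If it returns an address
        // that no local interface owns, the bind below throws a BindException.
        InetAddress localHost = InetAddress.getLocalHost();
        System.out.println("local host resolves to " + localHost);
        System.out.println("local host bound on port " + bindEphemeral(localHost));
    }
}
```

If the loopback bind succeeds and the second bind fails, the problem is the hostname resolution rather than port availability.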

mkandi commented 4 years ago

Thank you @efeg for your quick responses. Appreciate that very much.

Btw, I work for Twilio and am clarifying these license issues with our legal counsel. Here is a question from our legal counsel:

Why would LinkedIn want to misrepresent that they are the copyright owner of the new files when they are not?

The question is about the license requirements for the new files that we will be adding (as you know, I have 2 new files related to this new goal). Counsel is perfectly fine with changes to existing files, which are under LinkedIn Corp's name; the question is mainly about the newly added files.

Could you please help clarify the above?

Coming to the test failures, it appears something is wrong with my Mac. At the place you mentioned, localhost resolves to 0.0.16.115, which is not the IP address of any existing interface. I will experiment a bit more and get back to you.

efeg commented 4 years ago

Hi @mkandi, for the license requirements, our legal team asks contributors to agree to the Contribution Agreement, which says: "By submitting code, you (and, if applicable, your employer) are licensing the submitted code to LinkedIn and the open source community subject to the BSD 2-Clause license."

For the test failures, I hope you were able to resolve them on your end as I cannot reproduce this exception locally.

apellegr06 commented 4 years ago

Hello,

Do you have any news about this pluggable goal? I'm very interested.

Thank you

efeg commented 4 years ago

@apellegr06 In the short term (i.e. within this month or so), we intend to add a new goal or modify the existing RackAwareGoal to support a "best-effort" distribution of replicas over the racks for partitions with replication factor > number of racks.
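One possible reading of "best-effort" is that every rack holds either floor(RF / racks) or ceil(RF / racks) replicas of a partition. The sketch below checks a placement against that reading. It is an assumption about the intended behavior, not the goal that was eventually implemented, and all names in it are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of an even-spread check; not Cruise Control code. */
final class BestEffortRackSketch {

    /**
     * Returns true if the replicas of one partition are spread as evenly as possible.
     * replicaRacks holds one rack ID per replica; numRacks is the number of racks in the cluster.
     */
    static boolean isEvenlySpread(List<String> replicaRacks, int numRacks) {
        if (numRacks <= 0) {
            throw new IllegalArgumentException("numRacks must be positive");
        }
        int replicationFactor = replicaRacks.size();
        int floor = replicationFactor / numRacks;
        int ceil = (replicationFactor + numRacks - 1) / numRacks;

        Map<String, Integer> countByRack = new HashMap<>();
        for (String rack : replicaRacks) {
            countByRack.merge(rack, 1, Integer::sum);
        }
        if (countByRack.size() > numRacks) {
            throw new IllegalArgumentException("more distinct racks than numRacks");
        }
        // A rack with no replica is acceptable only when RF < number of racks.
        if (floor > 0 && countByRack.size() < numRacks) {
            return false;
        }
        for (int count : countByRack.values()) {
            if (count < floor || count > ceil) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Layout from the original request: 4 replicas alternating over 2 racks.
        System.out.println(isEvenlySpread(Arrays.asList("A", "B", "A", "B"), 2)); // true
        // RF 3 on 3 racks with two replicas sharing a rack, as reported above.
        System.out.println(isEvenlySpread(Arrays.asList("A", "A", "B"), 3)); // false
    }
}
```

When the replication factor is <= the number of racks, the cap is 1 replica per rack, so this check reduces to strict rack awareness.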

apellegr06 commented 4 years ago

Great! Thank you @efeg