strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0
4.8k stars 1.28k forks source link

Automatically set CC `hard.goals` to match `default.goals` when not provided #8204

Closed kyguy closed 1 year ago

kyguy commented 1 year ago

Is your feature request related to a problem? Please describe. Recently observed a user set a custom default.goals list in the .spec.cruiseControl.config section of a Kafka resource.

kind: Kafka
metadata:
  ...
spec
  cruiseControl:
    config:
      default.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
  ...

Since the the provided default.goals list was not a superset of the default hard.goals list, [1] Cruise Control threw the following error:

2023-03-07 14:00:33 ERROR KafkaCruiseControlMain:33 - Uncaught exception on thread Thread[main,5,main]
org.apache.kafka.common.config.ConfigException: Attempt to configure hard goals with unsupported goals (hard.goals:[com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal] and default.goals:[com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal, com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal]).
    at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig.sanityCheckGoalNames(KafkaCruiseControlConfig.java:210) ~[cruise-control-2.5.103.redhat-00004.jar:2.5.103.redhat-00004]
    at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig.<init>(KafkaCruiseControlConfig.java:53) ~[cruise-control-2.5.103.redhat-00004.jar:2.5.103.redhat-00004]
    at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig.<init>(KafkaCruiseControlConfig.java:47) ~[cruise-control-2.5.103.redhat-00004.jar:2.5.103.redhat-00004]
    at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlUtils.readConfig(KafkaCruiseControlUtils.java:866) ~[cruise-control-2.5.103.redhat-00004.jar:2.5.103.redhat-00004]
    at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:35) ~[cruise-control-2.5.103.redhat-00004.jar:2.5.103.redhat-00004]

This behavior is expected, we are even warned about setting the value of default.goals in the Cruise Control documentation:

The value must be a subset of the "goals" and a superset of the "hard.goals" and "self.healing.goals".

The way the Kafka resource should have been configured to avoid this problem would have been:

kind: Kafka
metadata:
  ...
spec
  cruiseControl:
    config:
      default.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
      hard.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
  ...

Describe the solution you'd like It would be a nicer UX if we had Strimzi automatically configure the hard.goals to match user provided default.goals when users provide a default.goals list but not a hard.goals list. We already see similair auto-configuration of goal lists happening like this in Cruise Control and Strimzi:

This change would provide users with a simpler goal configuration experience when they just want to balance their cluster based on a simple custom goal list.

Describe alternatives you've considered

Two alternatives:

  1. Add more explict examples + documentation: I recently opened a PR that provides an example and explicitly documents that "default.goals must be a superset of the hard.goals" to help guide users from making configuration mistakes like this. That being said, I feel the concept of goals and the variations/restrictions of configuration are already confusing enough as is.
  2. Add this auto-configuration to upstream Cruise Control: This is something we could still do, it would certainly save Strimzi the trouble. I still need to check with the upstream Cruise Control maintainers.

Additional context [1] https://github.com/strimzi/strimzi-kafka-operator/blob/main/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/CruiseControlConfiguration.java#L53-L60 [2] https://github.com/linkedin/cruise-control/wiki/Configurations [3] https://github.com/strimzi/strimzi-kafka-operator/blob/main/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/CruiseControl.java#L288-L295

ppatierno commented 1 year ago

Wondering if the warning we have today helps the user to avoid mistakes while if we autoconfigure hard goals, the user cannot see his mistake. By mistake I mean, the user set some default goals but he forgot an hard goal in that list. If we raise the warning he could get the mistake he is making. If we silently set the hard goals equals to default goals, he cannot realize the mistake.

This issue sounds to be also related to the one we raised few weeks ago about having the CC configuration easily accessible so the users can see what are the configured goals.

scholzj commented 1 year ago

We should also think about backwards compatibility. Does the situation with the goals not set always result in error? What is user adds many goals to the default goals but want to stick with the original hard goals? Will we now suddenly jump in and silently change the hard goals to match the default goals?

scholzj commented 1 year ago

Triaged on 23.3.2023:

ppatierno commented 1 year ago

@mimaison suggests setting the hard.goals automatically is a bad idea.

I missed the community call, so let me re-inforce my opinion here. As I wrote already before, I agree with Mickael. Setting hard goals automatically makes the user unaware of the "mistakes" he made and doesn't clarify what to do instead. I think that the warning we have today is enough.

kyguy commented 1 year ago

@ppatierno and @mimaison are right, setting hard.goals to match the default.goals could potentially makes proposal generation a lot more difficult since not all default.goals are intended to be "hard" meaning required to to be satisfied. Some default.goals are intended to be "soft" meaning not required to be satisfied but satisfied at best effort. We shouldn't assume the user will want to all default.goals to match the hard.goals, otherwise we are trading one problem for another when all the listed default.goals cannot be satisfied and the user is required to provide a hard.goal list so that a proposal can be satisfied.

Since the examples and documentation have been updated to address this configuration issue [1], we can close this issue.

This issue sounds to be also related to the one we raised few weeks ago about having the CC configuration easily accessible so the users can see what are the configured goals.

To go even further, we can implement the enhancement Paolo mentioned here [2]

[1] https://github.com/strimzi/strimzi-kafka-operator/pull/8212 [2] https://github.com/strimzi/strimzi-kafka-operator/issues/8043