apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0
2.11k stars 174 forks source link

[BUG] Endless affinity list when using new schudlingPolicy field #8105

Open cjc7373 opened 1 month ago

cjc7373 commented 1 month ago

Describe the bug Endless affinity list when using new schudlingPolicy field and a DATA_PLANE_AFFINITY in config

To Reproduce Steps to reproduce the behavior:

  1. Have a DATA_PLANE_AFFINITY: '{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"kb-data","operator":"In","values":["true"]}]},"weight":100}]}}' field in the config
  2. Create a cluster CR, with schedulingPolicy field not null. e.g.
    spec:
      schedulingPolicy:
        schedulerName: custom-scheduler
  3. cluster can't finish reconciling

The cluster CR will become like:

apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
spec:
  ...
  schedulingPolicy:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        - preference:
            matchExpressions:
            - key: kb-data
              operator: In
              values:
              - "true"
          weight: 100
        [repeat hundreds of times]
        ...
    schedulerName: custom-scheduler
    tolerations:
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
    [repeat hundreds of times]
    ...

Additional context The root cause is that the generated scheduling policy is patched to cluster CR itself. Though it only should be in the component CR.

On a deeper view, this bug happened because cluster controller would modify cluster CR's spec. In which use case do we need this behavior? After all, the general idea is that a controller will not modify the CR's spec it reconciles.

github-actions[bot] commented 2 weeks ago

This issue has been marked as stale because it has been open for 30 days with no activity