pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

TiDB Operator does not delete the original ConfigMap after the user changes the config in the CR, causing a resource leak #5741

Open kos-team opened 1 month ago

kos-team commented 1 month ago

Bug Report

What version of Kubernetes are you using?
Client Version: v1.31.0
Kustomize Version: v5.4.2
Server Version: v1.29.1

What version of TiDB Operator are you using?
v1.6.0

What's the status of the TiDB cluster pods?
All pods are in the Running state.

What did you do?
We updated the spec.tikv.config field to a different non-empty value.

How to reproduce

  1. Deploy a TiDB cluster, for example:

    apiVersion: pingcap.com/v1alpha1
    kind: TidbCluster
    metadata:
      name: test-cluster
    spec:
      configUpdateStrategy: RollingUpdate
      enableDynamicConfiguration: true
      helper:
        image: alpine:3.16.0
      pd:
        baseImage: pingcap/pd
        config: "[dashboard]\n  internal-proxy = true\n"
        maxFailoverCount: 0
        mountClusterClientSecret: true
        replicas: 3
        requests:
          storage: 10Gi
      pvReclaimPolicy: Retain
      ticdc:
        baseImage: pingcap/ticdc
        replicas: 3
      tidb:
        baseImage: pingcap/tidb
        config: "[performance]\n  tcp-keep-alive = true\ngraceful-wait-before-shutdown = 30\n"
        maxFailoverCount: 0
        replicas: 3
        service:
          externalTrafficPolicy: Local
          type: NodePort
      tiflash:
        baseImage: pingcap/tiflash
        replicas: 3
        storageClaims:
        - resources:
            requests:
              storage: 10Gi
      tikv:
        baseImage: pingcap/tikv
        config: |
          [raftdb]
            max-open-files = 256
          [rocksdb]
            max-open-files = 256
        maxFailoverCount: 0
        mountClusterClientSecret: true
        replicas: 3
        requests:
          storage: 100Gi
      timezone: UTC
      version: v8.1.0
  2. Change the spec.tikv.config to another non-empty value (here, rocksdb max-open-files is changed from 256 to 128), e.g.:

    apiVersion: pingcap.com/v1alpha1
    kind: TidbCluster
    metadata:
      name: test-cluster
    spec:
      configUpdateStrategy: RollingUpdate
      enableDynamicConfiguration: true
      helper:
        image: alpine:3.16.0
      pd:
        baseImage: pingcap/pd
        config: "[dashboard]\n  internal-proxy = true\n"
        maxFailoverCount: 0
        mountClusterClientSecret: true
        replicas: 3
        requests:
          storage: 10Gi
      pvReclaimPolicy: Retain
      ticdc:
        baseImage: pingcap/ticdc
        replicas: 3
      tidb:
        baseImage: pingcap/tidb
        config: "[performance]\n  tcp-keep-alive = true\ngraceful-wait-before-shutdown = 30\n"
        maxFailoverCount: 0
        replicas: 3
        service:
          externalTrafficPolicy: Local
          type: NodePort
      tiflash:
        baseImage: pingcap/tiflash
        replicas: 3
        storageClaims:
        - resources:
            requests:
              storage: 10Gi
      tikv:
        baseImage: pingcap/tikv
        config: |
          [raftdb]
            max-open-files = 256
          [rocksdb]
            max-open-files = 128
        maxFailoverCount: 0
        mountClusterClientSecret: true
        replicas: 3
        requests:
          storage: 100Gi
      timezone: UTC
      version: v8.1.0

What did you expect to see?
We expected the unused ConfigMaps to be garbage collected by the TiDB Operator. Otherwise the operator keeps generating new ConfigMaps on every config change and adds more and more objects to etcd.

What did you see instead?
The operator created a new ConfigMap for TiKV but left the old ConfigMap undeleted. We observed the same behavior when updating spec.tiflash.config, which suggests that all TiDB components are likely affected by this issue.
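For reference, the leftover ConfigMaps can be enumerated with a small client-go program along these lines. This is only a sketch: the default namespace, the test-cluster instance name, and the app.kubernetes.io label selector are assumptions rather than anything taken from the operator's code; adjust them to whatever labels your cluster actually shows.

    // list_leftover_configmaps.go: enumerate the TiKV ConfigMaps that pile up
    // after each config change. Namespace and label selector are assumptions.
    package main

    import (
        "context"
        "fmt"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Load the local kubeconfig (~/.kube/config).
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(cfg)

        // Assumed label selector for the TiKV ConfigMaps of the cluster above.
        cms, err := client.CoreV1().ConfigMaps("default").List(context.Background(), metav1.ListOptions{
            LabelSelector: "app.kubernetes.io/instance=test-cluster,app.kubernetes.io/component=tikv",
        })
        if err != nil {
            panic(err)
        }
        // Print each ConfigMap with its creation time to show the accumulation.
        for _, cm := range cms.Items {
            fmt.Println(cm.Name, cm.CreationTimestamp.Time)
        }
    }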

csuzhangxc commented 1 month ago

Currently, we generate a new ConfigMap for the RollingUpdate ConfigUpdateStrategy. It may be better to keep only a few recent ConfigMaps and delete the older ones.
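A minimal sketch of that idea, written against client-go rather than the operator's actual controller code; the label selector, namespace, and the number of revisions to keep are illustrative assumptions. It lists the ConfigMaps for one component, sorts them newest first, and deletes everything beyond the most recent few, so that pods still in the middle of a rolling update can keep mounting the revision they reference.

    // prune_old_configmaps.go: keep only the newest `keep` ConfigMaps matching a
    // label selector and delete the rest. A sketch, not tidb-operator's code.
    package main

    import (
        "context"
        "fmt"
        "sort"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func pruneOldConfigMaps(ctx context.Context, client kubernetes.Interface, ns, selector string, keep int) error {
        cms, err := client.CoreV1().ConfigMaps(ns).List(ctx, metav1.ListOptions{LabelSelector: selector})
        if err != nil {
            return err
        }
        items := cms.Items
        // Sort newest first by creation timestamp.
        sort.Slice(items, func(i, j int) bool {
            return items[i].CreationTimestamp.Time.After(items[j].CreationTimestamp.Time)
        })
        // Delete everything past the `keep` most recent revisions.
        for i := keep; i < len(items); i++ {
            if err := client.CoreV1().ConfigMaps(ns).Delete(ctx, items[i].Name, metav1.DeleteOptions{}); err != nil {
                return err
            }
            fmt.Println("deleted", items[i].Name)
        }
        return nil
    }

    func main() {
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(cfg)
        // Namespace, selector, and keep count are assumptions for illustration.
        selector := "app.kubernetes.io/instance=test-cluster,app.kubernetes.io/component=tikv"
        if err := pruneOldConfigMaps(context.Background(), client, "default", selector, 2); err != nil {
            panic(err)
        }
    }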