pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

TiDB operator cannot update configuration for TiKV #5729

Open kos-team opened 4 weeks ago

kos-team commented 4 weeks ago

Bug Report

What version of Kubernetes are you using?

1.28.1

What version of TiDB Operator are you using?

1.6.0

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

standard

What's the status of the TiDB cluster pods?

What did you do?

We tried to change the storage.engine option of TiKV via the spec.tikv.config property. After changing storage.engine from raft-kv to partitioned-raft-kv, TiKV keeps using raft-kv as the storage engine.

  1. Install a TiDB cluster by applying the following CR:
    apiVersion: pingcap.com/v1alpha1
    kind: TidbCluster
    metadata:
      name: basic
    spec:
      version: v8.1.0
      timezone: UTC
      pvReclaimPolicy: Retain
      enableDynamicConfiguration: true
      configUpdateStrategy: RollingUpdate
      discovery: {}
      helper:
        image: alpine:3.16.0
      pd:
        baseImage: pingcap/pd
        maxFailoverCount: 0
        replicas: 1
        requests:
          storage: "1Gi"
        config: {}
      tikv:
        baseImage: pingcap/tikv
        maxFailoverCount: 0
        evictLeaderTimeout: 1m
        replicas: 1
        requests:
          storage: "1Gi"
        config:
          storage:
            # In basic examples, we set this to avoid using too much storage.
            reserve-space: "0MB"
            engine: "raft-kv"
          rocksdb:
            # In basic examples, we set this to avoid the following error in some Kubernetes clusters:
            # "the maximum number of open file descriptors is too small, got 1024, expect greater or equal to 82920"
            max-open-files: 256
          raftdb:
            max-open-files: 256
      tidb:
        baseImage: pingcap/tidb
        maxFailoverCount: 0
        replicas: 1
        service:
          type: ClusterIP
        config: {}
  2. Change spec.tikv.config.storage.engine to partitioned-raft-kv.
  3. Use mysqlsh to query the current configuration; the engine is still reported as raft-kv:
    +------+----------------------------------------------------------------+----------------+---------+
    | Type | Instance                                                       | Name           | Value   |
    +------+----------------------------------------------------------------+----------------+---------+
    | tikv | advanced-tidb-tikv-2.advanced-tidb-tikv-peer.default.svc:20160 | storage.engine | raft-kv |
    | tikv | advanced-tidb-tikv-1.advanced-tidb-tikv-peer.default.svc:20160 | storage.engine | raft-kv |
    | tikv | advanced-tidb-tikv-0.advanced-tidb-tikv-peer.default.svc:20160 | storage.engine | raft-kv |
    +------+----------------------------------------------------------------+----------------+---------+
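
For reference, the edit in step 2 can be sketched as the fragment below. It shows only the path that changes within the manifest from step 1, and assumes every other field is left untouched. The output in step 3 has the shape of TiDB's `SHOW CONFIG` statement, though the exact query used is not given in this report.

```yaml
# Fragment of the TidbCluster CR after step 2 (illustrative; only `engine` differs from step 1).
spec:
  tikv:
    config:
      storage:
        reserve-space: "0MB"
        # Changed from "raft-kv" in step 1.
        engine: "partitioned-raft-kv"
```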

What did you expect to see? Storage engine should be updated, or the config change should be rejected explicitly.

What did you see instead? Although the Pods are restarted in a rolling fashion with the updated ConfigMap, the configuration actually used by TiKV is not updated. We suspect that the updated value gets lost when TiDB merges the configuration.

csuzhangxc commented 3 weeks ago

https://docs.pingcap.com/tidb/stable/tikv-configuration-file#engine-new-in-v660

This configuration can only be specified when creating a new cluster and cannot be modified once specified.

But as you said, rejecting it explicitly would be better.