pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0

br: `.spec.pause` is missing from backupschedule manifest after backup is created #5610

Open hongshaoyang opened 3 months ago

hongshaoyang commented 3 months ago

Bug Report

What version of Kubernetes are you using?

Server Version: v1.26.11-gke.1055000

What version of TiDB Operator are you using?

TiDB Operator Version: version.Info{GitVersion:"v1.5.1", GitCommit:"2802a0834c50dab95e5eb4409dfbcc9717330721", GitTreeState:"clean", BuildDate:"2023-10-20T08:13:25Z", GoVersion:"go1.21.3", Compiler:"gc", Platform:"linux/amd64"}

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

(irrelevant)

What's the status of the TiDB cluster pods?

(irrelevant)

What did you do?

  1. Apply BackupSchedule via kubectl apply:
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: "${var.backup_schedule_name}"
  namespace: "${var.backup_schedule_namespace}"
spec:
  maxBackups: ${var.backup_schedule_max_count}
  pause: ${var.is_backup_paused}
  schedule: "${var.backup_schedule_cron}"
  backupTemplate:
    backupType: "${var.backup_schedule_type}"
    toolImage: "pingcap/br:${var.backup_schedule_version}"
    br:
      cluster: "${var.tidb_cluster_name}"
      clusterNamespace: "${var.tidb_cluster_namespace}"
    gcs:
      projectId: "${var.gcp_project_id}"
      bucket: "${var.gcs_bucket_name}"
      prefix: "${var.gcs_bucket_prefix}"
      secretName: "${var.gcs_secret_name}"

What did you expect to see?

.spec.pause remains present in the kube manifest.

What did you see instead?

.spec.pause was present after applying the kube manifest. However, after the backup was successfully created at 20:00 UTC, the .spec.pause field became absent:

$ kubectl get backupschedule prod-v7-daily-backup -n prod-tidb -o yaml
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: xx
  creationTimestamp: "2023-12-21T05:29:47Z"
  generation: 117
  name: prod-v7-daily-backup
  namespace: prod-tidb
  resourceVersion: "4477218365"
  uid: 42376c6b-70c8-47b5-845a-180898682b60
spec:
  backupTemplate:
    backoffRetryPolicy:
      maxRetryTimes: 2
      minRetryDuration: 300s
      retryTimeout: 30m
    backupMode: snapshot
    backupType: full
    br:
      cluster: prod-v7
      clusterNamespace: prod-tidb
    calcSizeLevel: all
    gcs:
      bucket: xx
      prefix: xx
      projectId: xx
      secretName: xx
    resources: {}
    toolImage: pingcap/br:v7.1.2
  maxBackups: 3
  schedule: 0 20 * * *
status:
  lastBackup: prod-v7-daily-backup-2024-04-08t20-00-00
  lastBackupTime: "2024-04-08T20:00:00Z"

csuzhangxc commented 2 months ago

@hongshaoyang Did you set the value of pause (pause: ${var.is_backup_paused}) to true or false? Some default values may be absent from the manifest.

hongshaoyang commented 2 months ago

@csuzhangxc - it is set to the default, false, that is, scheduled backups are not paused.

"some default values may be absent in the manifest."

But that is not expected. This will cause issues with infrastructure-as-code tools like Terraform, which expect values not to change over time.

csuzhangxc commented 2 months ago

Specifically, for pause: false, I tested on Kubernetes server v1.27.3 with client v1.29.0, and I can get the field back with -o yaml.

Can you try it on other versions of server and client?

hongshaoyang commented 2 months ago

Our GKE clusters are all standardised on the same version, so I cannot try it on other versions.

One observation: right after applying the kube manifest, I can also see the pause: false field with -o yaml. But after a while (after the backup was created), the .spec.pause field became absent.

csuzhangxc commented 2 months ago

"But that is not expected. This will cause issues with infrastructure-as-code tools like Terraform, which expect values not to change over time."

One possible workaround is to not set these empty (zero-value) fields in IaC tools, since K8s and Go's JSON encoding have the omitempty behavior.
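
For readers unfamiliar with omitempty, here is a minimal sketch of the behavior in plain Go. The scheduleSpec struct and roundTrip helper are hypothetical stand-ins, not the operator's actual types, and the sketch assumes the real pause field is a plain bool tagged with omitempty. It only illustrates how a decode followed by a re-encode (as might happen when a controller writes the object back) can drop pause: false while keeping pause: true.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scheduleSpec is a hypothetical, trimmed-down stand-in for a BackupSchedule
// spec. The omitempty tag on Pause is an assumption made for illustration.
type scheduleSpec struct {
	Schedule   string `json:"schedule"`
	MaxBackups int32  `json:"maxBackups,omitempty"`
	Pause      bool   `json:"pause,omitempty"`
}

// roundTrip decodes a JSON spec into the typed struct and encodes it again.
// Any omitempty field holding its zero value is left out of the output.
func roundTrip(in string) (string, error) {
	var spec scheduleSpec
	if err := json.Unmarshal([]byte(in), &spec); err != nil {
		return "", err
	}
	out, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	inputs := []string{
		`{"schedule":"0 20 * * *","maxBackups":3,"pause":false}`,
		`{"schedule":"0 20 * * *","maxBackups":3,"pause":true}`,
	}
	for _, in := range inputs {
		out, err := roundTrip(in)
		if err != nil {
			panic(err)
		}
		fmt.Printf("in:  %s\nout: %s\n", in, out)
	}
}
```

With a plain bool, an explicit false and an unset field are indistinguishable after decoding, which is why leaving the field out of the IaC manifest when it is false avoids the apparent drift.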