apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0
2.16k stars, 176 forks

[BUG] Mongo restored cluster is Abnormal #4685

Closed: ahjing99 closed this issue 1 year ago

ahjing99 commented 1 year ago

➜ ~ kbcli version
Kubernetes: v1.27.2-gke.1200
KubeBlocks: 0.6.0-beta.29
kbcli: 0.6.0-beta.29

Restoring a MongoDB cluster from a backup started to fail in beta.24; see https://github.com/apecloud/kubeblocks/actions/runs/5784176695/job/15675084847

`kbcli cluster list-instances mongocluster --namespace default`

NAME                     NAMESPACE   CLUSTER        COMPONENT   STATUS    ROLE        ACCESSMODE   AZ              CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)           STORAGE    NODE                                                  CREATED-TIME
mongocluster-mongodb-0   default     mongocluster   mongodb     Running   secondary   Readonly     us-central1-c   200m / 200m          858993459200m / 858993459200m   data:8Gi   gke-yjtest-default-pool-8e798dc1-2pmb/10.128.15.228   Aug 08,2023 11:54 UTC+0800
mongocluster-mongodb-1   default     mongocluster   mongodb     Running   primary     ReadWrite    us-central1-c   200m / 200m          858993459200m / 858993459200m   data:8Gi   gke-yjtest-default-pool-8e798dc1-4z3z/10.128.15.226   Aug 08,2023 11:52 UTC+0800
mongocluster-mongodb-2   default     mongocluster   mongodb     Running   secondary   Readonly     us-central1-c   200m / 200m          858993459200m / 858993459200m   data:8Gi   gke-yjtest-default-pool-8e798dc1-hvxp/10.128.15.227   Aug 08,2023 11:52 UTC+0800
check pod status done
check cluster connect
Unable to use a TTY - input is not a terminal or the right kind of file
check cluster connect done
cluster snapshot backup

`kbcli cluster backup mongocluster --type snapshot --namespace default`

Backup backup-default-mongocluster-20230808115531 created successfully, you can view the progress:
    kbcli cluster list-backups --name=backup-default-mongocluster-20230808115531 -n default

`kbcli cluster restore mongocluster-backup --backup backup-default-mongocluster-20230808115531 --namespace default`

Cluster mongocluster-backup created

➜  ~ k describe cluster mongocluster-backup
Name:         mongocluster-backup
Namespace:    default
Labels:       clusterdefinition.kubeblocks.io/name=mongodb
              clusterversion.kubeblocks.io/name=mongodb-5.0.14
Annotations:  kubeblocks.io/restore-from-backup: {"mongodb":"backup-default-mongocluster-20230808115531"}
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2023-08-08T03:56:24Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generation:  1
  Managed Fields:
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubeblocks.io/restore-from-backup:
      f:spec:
        .:
        f:affinity:
          .:
          f:podAntiAffinity:
          f:tenancy:
        f:clusterDefinitionRef:
        f:clusterVersionRef:
        f:componentSpecs:
          .:
          k:{"name":"mongodb"}:
            .:
            f:classDefRef:
              .:
              f:class:
            f:componentDefRef:
            f:enabledLogs:
              .:
              v:"running":
            f:monitor:
            f:name:
            f:noCreatePDB:
            f:replicas:
            f:resources:
              .:
              f:limits:
                .:
                f:cpu:
                f:memory:
              f:requests:
                .:
                f:cpu:
                f:memory:
            f:serviceAccountName:
            f:volumeClaimTemplates:
        f:monitor:
        f:resources:
          .:
          f:cpu:
          f:memory:
        f:storage:
          .:
          f:size:
        f:terminationPolicy:
    Manager:      kbcli
    Operation:    Update
    Time:         2023-08-08T03:56:24Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"cluster.kubeblocks.io/finalizer":
        f:labels:
          .:
          f:clusterdefinition.kubeblocks.io/name:
          f:clusterversion.kubeblocks.io/name:
    Manager:      manager
    Operation:    Update
    Time:         2023-08-08T03:56:25Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:clusterDefGeneration:
        f:components:
          .:
          f:mongodb:
            .:
            f:consensusSetStatus:
              .:
              f:leader:
                .:
                f:accessMode:
                f:name:
                f:pod:
            f:phase:
            f:podsReady:
            f:podsReadyTime:
        f:conditions:
        f:observedGeneration:
        f:phase:
    Manager:         manager
    Operation:       Update
    Subresource:     status
    Time:            2023-08-08T04:06:13Z
  Resource Version:  6810182
  UID:               7adb674f-8e76-4a86-9c18-1d55e6796bcd
Spec:
  Affinity:
    Pod Anti Affinity:     Preferred
    Tenancy:               SharedNode
  Cluster Definition Ref:  mongodb
  Cluster Version Ref:     mongodb-5.0.14
  Component Specs:
    Class Def Ref:
      Class:
    Component Def Ref:  mongodb
    Enabled Logs:
      running
    Monitor:        true
    Name:           mongodb
    No Create PDB:  false
    Replicas:       3
    Resources:
      Limits:
        Cpu:     200m
        Memory:  858993459200m
      Requests:
        Cpu:               200m
        Memory:            858993459200m
    Service Account Name:  kb-mongocluster
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  8Gi
  Monitor:
  Resources:
    Cpu:     0
    Memory:  0
  Storage:
    Size:              0
  Termination Policy:  WipeOut
Status:
  Cluster Def Generation:  2
  Components:
    Mongodb:
      Consensus Set Status:
        Leader:
          Access Mode:  ReadWrite
          Name:         primary
          Pod:          mongocluster-backup-mongodb-0
      Phase:            Abnormal
      Pods Ready:       true
      Pods Ready Time:  2023-08-08T04:01:12Z
  Conditions:
    Last Transition Time:  2023-08-08T03:56:24Z
    Message:               The operator has started the provisioning of Cluster: mongocluster-backup
    Observed Generation:   1
    Reason:                PreCheckSucceed
    Status:                True
    Type:                  ProvisioningStarted
    Last Transition Time:  2023-08-08T03:56:24Z
    Message:               Successfully applied for resources
    Observed Generation:   1
    Reason:                ApplyResourcesSucceed
    Status:                True
    Type:                  ApplyResources
    Last Transition Time:  2023-08-08T04:01:12Z
    Message:               all pods of components are ready, waiting for the probe detection successful
    Reason:                AllReplicasReady
    Status:                True
    Type:                  ReplicasReady
    Last Transition Time:  2023-08-08T03:56:25Z
    Message:               pods are unavailable in Components: [mongodb], refer to related component message in Cluster.status.components
    Reason:                ComponentsNotReady
    Status:                False
    Type:                  Ready
  Observed Generation:     1
  Phase:                   Abnormal
Events:
  Type     Reason                    Age                 From                Message
  ----     ------                    ----                ----                -------
  Normal   ComponentPhaseTransition  34m                 cluster-controller  Create a new component
  Normal   PreCheckSucceed           34m                 cluster-controller  The operator has started the provisioning of Cluster: mongocluster-backup
  Normal   ApplyResourcesSucceed     34m                 cluster-controller  Successfully applied for resources
  Warning  ReplicasNotReady          30m                 cluster-controller  pods are not ready in Components: [mongodb], refer to related component message in Cluster.status.components
  Normal   AllReplicasReady          29m (x2 over 31m)   cluster-controller  all pods of components are ready, waiting for the probe detection successful
  Normal   WaitingForProbeSuccess    26m (x20 over 31m)  cluster-controller  Waiting for probe success

➜  ~ kbcli cluster describe mongocluster-backup
Name: mongocluster-backup    Created Time: Aug 08,2023 11:56 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION          STATUS     TERMINATION-POLICY
default     mongodb              mongodb-5.0.14   Abnormal   WipeOut

Endpoints:
COMPONENT   MODE        INTERNAL                                                      EXTERNAL
mongodb     ReadWrite   mongocluster-backup-mongodb.default.svc.cluster.local:27017   <none>

Topology:
COMPONENT   INSTANCE                        ROLE      STATUS    AZ              NODE                                                  CREATED-TIME
mongodb     mongocluster-backup-mongodb-0   primary   Running   us-central1-c   gke-yjtest-default-pool-8e798dc1-4z3z/10.128.15.226   Aug 08,2023 11:56 UTC+0800
mongodb     mongocluster-backup-mongodb-1   <none>    Running   us-central1-c   gke-yjtest-default-pool-8e798dc1-2pmb/10.128.15.228   Aug 08,2023 11:56 UTC+0800
mongodb     mongocluster-backup-mongodb-2   <none>    Running   us-central1-c   gke-yjtest-default-pool-8e798dc1-hvxp/10.128.15.227   Aug 08,2023 11:56 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)           STORAGE-SIZE   STORAGE-CLASS
mongodb     false       200m / 200m          858993459200m / 858993459200m   data:8Gi       kb-default-sc

Images:
COMPONENT   TYPE      IMAGE
mongodb     mongodb   registry.cn-hangzhou.aliyuncs.com/apecloud/mongo:5.0.14

Data Protection:
AUTO-BACKUP   BACKUP-SCHEDULE   TYPE     BACKUP-TTL   LAST-SCHEDULE   RECOVERABLE-TIME
Disabled      <none>            <none>   7d           <none>          <none>

Show cluster events: kbcli cluster list-events -n default mongocluster-backup

JashBook commented 1 year ago

The problem can be reproduced with the following steps:

1. kbcli cluster create mongo-ijtzvl --termination-policy=Delete --monitor=false --enable-all-logs=false --cluster-definition=mongodb --set cpu=100m,memory=0.5Gi,replicas=3,storage=6Gi --namespace default
2. kbcli cluster backup mongo-ijtzvl --type snapshot
3. kbcli cluster restore mongo-ijtzvl-backup --backup backup-default-mongo-ijtzvl-xxx
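
To check the result of step 3 without watching the cluster by hand, a small polling helper may be useful. The sketch below reads `.status.phase` of the Cluster resource (the `Phase:` field shown in the `describe` output earlier in this issue) until it reports `Running` or a timeout expires. The function names `get_phase` and `wait_for_running`, the timeout and interval defaults, and the choice of `Running` as the healthy phase are assumptions made for illustration; the issue itself only shows the failing phase `Abnormal`.

```shell
#!/usr/bin/env bash
# Sketch only: poll a restored cluster until it reports Running or a timeout expires.
# Assumptions (not from the issue): the helper names, the default timeout/interval,
# and that "Running" is the healthy cluster phase.

# Print .status.phase of the Cluster resource (the "Phase:" field in `kubectl describe`).
get_phase() {
  local cluster="$1" namespace="${2:-default}"
  kubectl get cluster "$cluster" --namespace "$namespace" -o jsonpath='{.status.phase}'
}

# Poll get_phase every $interval seconds until it prints "Running"
# or until more than $timeout seconds have elapsed.
# Returns 0 on Running, 1 on timeout (and reports the last observed phase on stderr).
wait_for_running() {
  local cluster="$1" namespace="${2:-default}" timeout="${3:-600}" interval="${4:-10}"
  local elapsed=0 phase=""
  while [ "$elapsed" -le "$timeout" ]; do
    phase="$(get_phase "$cluster" "$namespace" 2>/dev/null || true)"
    if [ "$phase" = "Running" ]; then
      echo "cluster $cluster is Running"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "cluster $cluster did not reach Running within ${timeout}s (last phase: ${phase:-unknown})" >&2
  return 1
}
```

After step 3, something like `wait_for_running mongo-ijtzvl-backup default` could be run. On an affected build it would presumably time out and report `last phase: Abnormal`, matching the status shown in this issue.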
JashBook commented 1 year ago

Fixed via https://github.com/apecloud/kubeblocks/commit/12ae1844c0f5d01c105f340b4e42be4c21dfad3f