apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG]VolumeExpansion ops of polardbx cn/cdc is always running #5274

Open ahjing99 opened 11 months ago

ahjing99 commented 11 months ago

➜  ~ kbcli version
Kubernetes: v1.27.3-gke.100
KubeBlocks: 0.7.0-alpha.18
kbcli: 0.7.0-alpha.18

The VolumeExpansion OpsRequest stays in Running status and never completes. The PV capacity has changed to 21Gi, but the PVC capacity still shows 20Gi.

➜  ~ helm repo add kubeblocks https://apecloud.github.io/helm-charts
"kubeblocks" has been added to your repositories
➜  ~ helm upgrade --install polardbx kubeblocks/polardbx --version 0.7.0-alpha.18
Release "polardbx" does not exist. Installing it now.
NAME: polardbx
LAST DEPLOYED: Tue Sep 26 16:46:20 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thanks for installing PolarDB-X using KubeBlocks!

1. Run the following command to create your first PolarDB-X cluster:

kbcli cluster create pxc --cluster-definition polardbx


2. Port-forward service to localhost and connect to PolarDB-X cluster:

kubectl port-forward svc/pxc-cn 3306:3306
mysql -h127.0.0.1 -upolardbx_root

➜  ~ kbcli cluster create test  --cluster-definition=polardbx
Info: --cluster-version is not specified, ClusterVersion polardbx-v1.4.1 is applied by default
Cluster test created

➜  ~ kbcli cluster describe test
Name: test   Created Time: Sep 26,2023 16:47 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION           STATUS     TERMINATION-POLICY
default     polardbx             polardbx-v1.4.1   Updating   Delete

Endpoints:
COMPONENT   MODE        INTERNAL                                  EXTERNAL
gms         ReadWrite   test-gms.default.svc.cluster.local:3306   <none>
                        test-gms.default.svc.cluster.local:9104
dn          ReadWrite   test-dn.default.svc.cluster.local:3306    <none>
cn          ReadWrite   test-cn.default.svc.cluster.local:3306    <none>
                        test-cn.default.svc.cluster.local:9104
cdc         ReadWrite   test-cdc.default.svc.cluster.local:3306   <none>
                        test-cdc.default.svc.cluster.local:9104

Topology:
COMPONENT   INSTANCE     ROLE       STATUS    AZ              NODE                                                CREATED-TIME
cdc         test-cdc-0   <none>     Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-hsgn/10.128.0.37   Sep 26,2023 16:47 UTC+0800
cn          test-cn-0    <none>     Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-hsgn/10.128.0.37   Sep 26,2023 16:47 UTC+0800
dn          test-dn-0    leader     Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-xqs0/10.128.0.39   Sep 26,2023 16:47 UTC+0800
dn          test-dn-1    follower   Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-pbkd/10.128.0.26   Sep 26,2023 16:47 UTC+0800
dn          test-dn-2    follower   Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-rxv2/10.128.0.38   Sep 26,2023 16:54 UTC+0800
gms         test-gms-0   leader     Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-xqs0/10.128.0.39   Sep 26,2023 16:47 UTC+0800
gms         test-gms-1   follower   Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-pbkd/10.128.0.26   Sep 26,2023 16:47 UTC+0800
gms         test-gms-2   follower   Running   us-central1-c   gke-yjtest-default-pool-16b6e83c-rxv2/10.128.0.38   Sep 26,2023 16:47 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
gms         false       1 / 1                1Gi / 1Gi               data:21Gi      kb-default-sc
dn          false       1 / 1                1Gi / 1Gi               data:21Gi      kb-default-sc
cn          false       1 / 1                1Gi / 1Gi               data:21Gi      kb-default-sc
cdc         false       1 / 1                1Gi / 1Gi               data:20Gi      kb-default-sc

Images:
COMPONENT   TYPE   IMAGE
gms         gms    polardbx/polardbx-engine-2.0:latest
dn          dn     polardbx/polardbx-engine-2.0:latest
cn          cn     polardbx/polardbx-sql:latest
cdc         cdc    polardbx/polardbx-cdc:latest

Show cluster events: kbcli cluster list-events -n default test

➜  ~ kbcli cluster volume-expand test  --components cn --volume-claim-templates data --storage 21Gi
Please type the name again(separate with white space when more than one): test
OpsRequest test-volumeexpansion-z5hcc created successfully, you can view the progress:

kbcli cluster describe-ops test-volumeexpansion-z5hcc -n default

➜  ~ kbcli cluster describe-ops test-volumeexpansion-z5hcc -n default
Spec:
  Name: test-volumeexpansion-z5hcc  NameSpace: default  Cluster: test   Type: VolumeExpansion

Command:
  kbcli cluster volume-expand test --components=cn --volume-claim-template-names=data --storage=21Gi --namespace=default

Last Configuration:
COMPONENT   VOLUME-CLAIM-TEMPLATE   STORAGE
cn          data                    20Gi

Status:
  Start Time:         Sep 26,2023 17:19 UTC+0800
  Duration:           6m55s
  Status:             Running
  Progress:           0/1
                      OBJECT-KEY                 STATUS       DURATION    MESSAGE
                      PVC/data-test-cn-0(data)   Processing   <Unknown>   Start expanding volume: PVC/data-test-cn-0 in Component: cn

Conditions:
LAST-TRANSITION-TIME         TYPE              REASON                         STATUS   MESSAGE
Sep 26,2023 17:19 UTC+0800   Progressing       OpsRequestProgressingStarted   True     Start to process the OpsRequest: test-volumeexpansion-z5hcc in Cluster: test
Sep 26,2023 17:19 UTC+0800   Validated         ValidateOpsRequestPassed       True     OpsRequest: test-volumeexpansion-z5hcc is validated
Sep 26,2023 17:19 UTC+0800   VolumeExpanding   VolumeExpansionStarted         True     Start to expand the volumes in Cluster: test

Warning Events: <none>

➜  ~ k describe pvc data-test-cn-0
Name:          data-test-cn-0
Namespace:     default
StorageClass:  kb-default-sc
Status:        Bound
Volume:        pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
Labels:        app.kubernetes.io/instance=test
               app.kubernetes.io/managed-by=kubeblocks
               app.kubernetes.io/name=polardbx
               apps.kubeblocks.io/component-name=cn
               apps.kubeblocks.io/vct-name=data
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
               volume.kubernetes.io/selected-node: gke-yjtest-default-pool-16b6e83c-hsgn
               volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      20Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       test-cn-0
Conditions:
  Type                      Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----                      ------  -----------------                 ------------------                ------  -------
  FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Tue, 26 Sep 2023 17:19:42 +0800           Waiting for user to (re-)start a pod to finish file system resize of volume on node.
Events:
  Type     Reason                    Age                From                                                                                              Message
  ----     ------                    ----               ----                                                                                              -------
  Normal   WaitForFirstConsumer      40m                persistentvolume-controller                                                                       waiting for first consumer to be created before binding
  Normal   ExternalProvisioning      40m (x2 over 40m)  persistentvolume-controller                                                                       waiting for a volume to be created, either by external provisioner "pd.csi.storage.gke.io" or manually created by system administrator
  Normal   Provisioning              40m                pd.csi.storage.gke.io_gke-e81b3d5226fe4e088e54-819b-0aa3-vm_8d2cefde-e802-44df-94b2-d1edb676b67a  External provisioner is provisioning volume for claim "default/data-test-cn-0"
  Normal   ProvisioningSucceeded     40m                pd.csi.storage.gke.io_gke-e81b3d5226fe4e088e54-819b-0aa3-vm_8d2cefde-e802-44df-94b2-d1edb676b67a  Successfully provisioned volume pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
  Warning  ExternalExpanding         7m40s              volume_expand                                                                                     waiting for an external controller to expand this PVC
  Normal   Resizing                  7m39s              external-resizer pd.csi.storage.gke.io                                                            External resizer is resizing volume pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
  Normal   FileSystemResizeRequired  7m32s              external-resizer pd.csi.storage.gke.io                                                            Require file system resize of volume on node

➜  ~ k describe pv pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
Name:              pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
                   volume.kubernetes.io/provisioner-deletion-secret-name:
                   volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers:        [kubernetes.io/pv-protection external-attacher/pd-csi-storage-gke-io]
StorageClass:      kb-default-sc
Status:            Bound
Claim:             default/data-test-cn-0
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          21Gi
Node Affinity:
  Required Terms:
    Term 0:        topology.gke.io/zone in [us-central1-c]
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            pd.csi.storage.gke.io
    FSType:            xfs
    VolumeHandle:      projects/kubeblocks/zones/us-central1-c/disks/pvc-dafc3f90-eb97-4989-8ed0-4e928e84b0b5
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1695697785164-6118-pd.csi.storage.gke.io
Events:                <none>
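
The two `describe` outputs above show the mismatch: the PV reports 21Gi while the PVC still reports 20Gi and carries a `FileSystemResizePending` condition. A minimal Python sketch of that check follows. It assumes the objects were exported as JSON (for example with `kubectl get pvc data-test-cn-0 -o json`). The helper names `quantity_to_bytes` and `diagnose_pvc` are hypothetical and not part of kbcli or KubeBlocks, and the sample objects are trimmed-down stand-ins for the real manifests, with the 21Gi request taken from the OpsRequest rather than from the PVC spec.

```python
from decimal import Decimal

# Binary and decimal suffixes used by Kubernetes storage quantities.
# Fractional suffixes such as "m" are not handled in this sketch.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a storage quantity such as '21Gi' to a number of bytes."""
    # Try two-letter suffixes first so that 'Gi' is not mistaken for 'G'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(Decimal(quantity))


def diagnose_pvc(pvc: dict, pv: dict) -> dict:
    """Compare requested, PVC-reported, and PV capacity for one volume."""
    requested = quantity_to_bytes(pvc["spec"]["resources"]["requests"]["storage"])
    status = pvc.get("status", {})
    reported = quantity_to_bytes(status.get("capacity", {}).get("storage", "0"))
    pv_capacity = quantity_to_bytes(pv["spec"]["capacity"]["storage"])
    # Collect the condition types that are currently set to True.
    active = {
        c["type"] for c in status.get("conditions", []) if c.get("status") == "True"
    }
    return {
        "pv_resized": pv_capacity >= requested,
        "pvc_resized": reported >= requested,
        "fs_resize_pending": "FileSystemResizePending" in active,
    }


# Trimmed-down objects mirroring the values in this report (assumed shape).
sample_pvc = {
    "spec": {"resources": {"requests": {"storage": "21Gi"}}},
    "status": {
        "capacity": {"storage": "20Gi"},
        "conditions": [{"type": "FileSystemResizePending", "status": "True"}],
    },
}
sample_pv = {"spec": {"capacity": {"storage": "21Gi"}}}

if __name__ == "__main__":
    print(diagnose_pvc(sample_pvc, sample_pv))
```

With the sample values, the result marks the PV as resized, the PVC as not yet resized, and the file system resize as pending, which matches the state in which the OpsRequest stays at `Progress: 0/1`.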

➜  ~ kbcli report cluster test --with-logs --all-containers
reporting cluster information to report-cluster-test-2023-09-26-17-31-59.zip
processing manifests OK
processing events OK
process pod logs OK
➜  ~ kbcli report kubeblocks --with-logs --all-containers --output yaml
reporting KubeBlocks information to report-kubeblocks-2023-09-26-17-33-00.zip
processing manifests OK
processing events OK
process pod logs

report-kubeblocks-2023-09-26-17-33-00.zip
report-cluster-test-2023-09-26-17-31-59.zip

github-actions[bot] commented 10 months ago

This issue has been marked as stale because it has been open for 30 days with no activity