apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] MySQL hscale failed after upgrading kb from 0.5.3 to 0.6.0 #4586

Closed ahjing99 closed 1 year ago

ahjing99 commented 1 year ago
  1. Create cluster on 0.5.3

    kubectl apply -f -<<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
    rules:
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dbname
    subjects:
    EOF

    k create -f mysql.yaml

    ➜ ~ cat mysql.yaml
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      labels:
        clusterdefinition.kubeblocks.io/name: apecloud-mysql
        clusterversion.kubeblocks.io/name: ac-mysql-8.0.30
      generateName: mysql-
      namespace: default
    spec:
      affinity:
        nodeLabels: {}
        podAntiAffinity: Preferred
        tenancy: SharedNode
        topologyKeys: []
      clusterDefinitionRef: apecloud-mysql
      clusterVersionRef: ac-mysql-8.0.30
      componentSpecs:

2. Upgrade kb to 0.6.0-beta.21
3. Hscale cluster failed

kbcli cluster hscale mysql-kjst5 --auto-approve --components mysql --replicas 5 --namespace default

OpsRequest mysql-kjst5-horizontalscaling-bjhdq created successfully, you can view the progress: kbcli cluster describe-ops mysql-kjst5-horizontalscaling-bjhdq -n default

➜  ~ kbcli cluster describe-ops mysql-kjst5-horizontalscaling-bjhdq -n default
Spec:
  Name: mysql-kjst5-horizontalscaling-bjhdq   NameSpace: default   Cluster: mysql-kjst5   Type: HorizontalScaling

Command:
  kbcli cluster hscale mysql-kjst5 --components=mysql --replicas=5 --namespace=default

Last Configuration:
COMPONENT   REPLICAS
mysql       2

Status:
  Start Time:   Aug 02,2023 15:06 UTC+0800
  Duration:     9m28s
  Status:       Running
  Progress:     0/3
                OBJECT-KEY   STATUS   DURATION   MESSAGE

Conditions:
LAST-TRANSITION-TIME         TYPE                REASON                         STATUS   MESSAGE
Aug 02,2023 15:06 UTC+0800   Progressing         OpsRequestProgressingStarted   True     Start to process the OpsRequest: mysql-kjst5-horizontalscaling-bjhdq in Cluster: mysql-kjst5
Aug 02,2023 15:06 UTC+0800   Validated           ValidateOpsRequestPassed       True     OpsRequest: mysql-kjst5-horizontalscaling-bjhdq is validated
Aug 02,2023 15:06 UTC+0800   HorizontalScaling   HorizontalScalingStarted       True     Start to horizontal scale replicas in Cluster: mysql-kjst5

Warning Events:

➜  ~ k get pod | grep mysql
mysql-kjst5-mysql-0   5/5   Running             0   13m
mysql-kjst5-mysql-1   5/5   Running             0   10m
mysql-kjst5-mysql-2   0/5   ContainerCreating   0   9m23s
mysql-kjst5-mysql-3   0/5   ContainerCreating   0   9m23s
mysql-kjst5-mysql-4   0/5   ContainerCreating   0   9m23s

➜  ~ k describe pod mysql-kjst5-mysql-2
Name:           mysql-kjst5-mysql-2
Namespace:      default
Priority:       0
Node:           gke-yjtest-default-pool-8e798dc1-4z3z/10.128.15.226
Start Time:     Wed, 02 Aug 2023 15:07:00 +0800
Labels:         app.kubernetes.io/component=mysql
                app.kubernetes.io/instance=mysql-kjst5
                app.kubernetes.io/managed-by=kubeblocks
                app.kubernetes.io/name=apecloud-mysql
                app.kubernetes.io/version=ac-mysql-8.0.30
                apps.kubeblocks.io/component-name=mysql
                apps.kubeblocks.io/workload-type=Consensus
                controller-revision-hash=mysql-kjst5-mysql-787f9d7c54
                statefulset.kubernetes.io/pod-name=mysql-kjst5-mysql-2
Annotations:    apps.kubeblocks.io/component-replicas: 5
                config.kubeblocks.io/restart-mysql-consensusset-config: 79f7655cc6
                kubeblocks.io/restart: 2023-08-02T06:59:12Z
Status:         Pending
IP:
IPs:
Controlled By:  StatefulSet/mysql-kjst5-mysql
Containers:
  mysql:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-server:8.0.30-5.alpha9.20230606.gf80d546.9
    Image ID:
    Ports:         3306/TCP, 13306/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /scripts/setup.sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     200m
      memory:  322122547200m
    Requests:
      cpu:     200m
      memory:  322122547200m
    Environment Variables from:
      mysql-kjst5-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mysql-kjst5-mysql-2 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mysql-kjst5
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mysql-kjst5-mysql
      KB_CLUSTER_UID_POSTFIX_8:  39defef5
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      MYSQL_ROOT_HOST:           %
      MYSQL_ROOT_USER:           <set to the key 'username' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      MYSQL_ROOT_PASSWORD:       <set to the key 'password' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      MYSQL_DATABASE:            mydb
      MYSQL_USER:                u1
      MYSQL_PASSWORD:            u1
      CLUSTER_ID:                1
      CLUSTER_START_INDEX:       1
      REPLICATION_USER:          replicator
      REPLICATION_PASSWORD:
      MYSQL_TEMPLATE_CONFIG:
      MYSQL_CUSTOM_CONFIG:
      MYSQL_DYNAMIC_CONFIG:
      KB_EMBEDDED_WESQL:         1
    Mounts:
      /data/mysql from data (rw)
      /etc/annotations from annotations (rw)
      /opt/mysql from mysql-config (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mhsmk (ro)
  metrics:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:
    Port:          9104/TCP
    Host Port:     0/TCP
    Command:
      /scripts/agamotto.sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mysql-kjst5-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mysql-kjst5-mysql-2 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mysql-kjst5
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mysql-kjst5-mysql
      KB_CLUSTER_UID_POSTFIX_8:  39defef5
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      DB_TYPE:                   MySQL
      ENDPOINT:                  localhost:3306
      MYSQL_USER:                <set to the key 'username' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      MYSQL_PASSWORD:            <set to the key 'password' in secret 'mysql-kjst5-conn-credential'>  Optional: false
    Mounts:
      /data/mysql from data (rw)
      /opt/agamotto from agamotto-configuration (rw)
      /scripts from scripts (rw)
      /var/log/kubeblocks from log-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mhsmk (ro)
  vttablet:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/apecloud-mysql-scale:0.1.1
    Image ID:
    Ports:         15100/TCP, 16100/TCP, 40000/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /scripts/vttablet.sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mysql-kjst5-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mysql-kjst5-mysql-2 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mysql-kjst5
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mysql-kjst5-mysql
      KB_CLUSTER_UID_POSTFIX_8:  39defef5
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      CELL:                      zone1
      ETCD_SERVER:               $(KB_CLUSTER_NAME)-vtcontroller-headless
      ETCD_PORT:                 2379
      TOPOLOGY_FLAGS:            --topo_implementation etcd2 --topo_global_server_address $(ETCD_SERVER):$(ETCD_PORT) --topo_global_root /vitess/global
      VTTABLET_PORT:             15100
      VTTABLET_GRPC_PORT:        16100
      VTCTLD_HOST:               $(KB_CLUSTER_NAME)-vtcontroller-headless
      VTCTLD_WEB_PORT:           15000
      MYSQL_ROOT_USER:           <set to the key 'username' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      MYSQL_ROOT_PASSWORD:       <set to the key 'password' in secret 'mysql-kjst5-conn-credential'>  Optional: false
    Mounts:
      /conf from mysql-scale-config (rw)
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mhsmk (ro)
      /vtdataroot from data (rw)
  kb-checkrole:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-beta.21
    Image ID:
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      probe
      --app-id
      batch-sdk
      --dapr-http-port
      3501
      --dapr-grpc-port
      50001
      --log-level
      info
      --config
      /config/probe/config.yaml
      --components-path
      /config/probe/components
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Readiness:  http-get http://:3501/v1.0/bindings/mysql%3Foperation=checkRole delay=0s timeout=1s period=1s #success=1 #failure=2
    Startup:    tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      mysql-kjst5-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                mysql-kjst5-mysql-2 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_CLUSTER_NAME:            mysql-kjst5
      KB_COMP_NAME:               mysql
      KB_CLUSTER_COMP_NAME:       mysql-kjst5-mysql
      KB_CLUSTER_UID_POSTFIX_8:   39defef5
      KB_POD_FQDN:                $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_USER:            <set to the key 'username' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:        <set to the key 'password' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      KB_SERVICE_PORT:            3306
      KB_SERVICE_ROLES:           {"follower":"Readonly","leader":"ReadWrite","learner":"Readonly"}
      KB_SERVICE_CHARACTER_TYPE:  mysql
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mhsmk (ro)
  config-manager:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-beta.21
    Image ID:
    Port:
    Host Port:
    Command:
      env
    Args:
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
      /bin/reloader
      --log-level
      info
      --operator-update-enable
      --tcp
      9901
      --config
      /opt/config-manager/config-manager.yaml
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mysql-kjst5-mysql-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mysql-kjst5-mysql-2 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mysql-kjst5
      KB_COMP_NAME:              mysql
      KB_CLUSTER_COMP_NAME:      mysql-kjst5-mysql
      KB_CLUSTER_UID_POSTFIX_8:  39defef5
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      CONFIG_MANAGER_POD_IP:      (v1:status.podIP)
      DB_TYPE:                   mysql
      MYSQL_USER:                <set to the key 'username' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      MYSQL_PASSWORD:            <set to the key 'password' in secret 'mysql-kjst5-conn-credential'>  Optional: false
      DATA_SOURCE_NAME:          $(MYSQL_USER):$(MYSQL_PASSWORD)@(localhost:3306)/
      TOOLS_PATH:                /opt/kb-tools/reload/mysql-consensusset-config:/opt/config-manager
    Mounts:
      /conf from mysql-scale-config (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/mysql-consensusset-config from cm-script-mysql-consensusset-config (rw)
      /opt/mysql from mysql-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mhsmk (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-mysql-kjst5-mysql-2
    ReadOnly:   false
  log-data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/kubeblocks
    HostPathType:  DirectoryOrCreate
  annotations:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations['cs.apps.kubeblocks.io/leader'] -> leader
      metadata.annotations['apps.kubeblocks.io/component-replicas'] -> component-replicas
  agamotto-configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mysql-kjst5-mysql-agamotto-configuration
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mysql-kjst5-mysql-apecloud-mysql-scripts
    Optional:  false
  mysql-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mysql-kjst5-mysql-mysql-consensusset-config
    Optional:  false
  mysql-scale-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mysql-kjst5-mysql-vttablet-config
    Optional:  false
  cm-script-mysql-consensusset-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-mysql-reload-script-mysql-kjst5
    Optional:  false
  config-manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-mysql-kjst5-mysql-config-manager-config
    Optional:  false
  kube-api-access-mhsmk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:
Tolerations:                 kb-data=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Normal   Scheduled               9m12s                default-scheduler        Successfully assigned default/mysql-kjst5-mysql-2 to gke-yjtest-default-pool-8e798dc1-4z3z
  Normal   SuccessfulAttachVolume  9m5s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0"
  Warning  FailedMount             43s (x12 over 9m3s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/68d4a59218a3e3ff2872885ad3f67fb320859467c5370fe70b09b307bc60c25e/globalmount") with fstype ("xfs") and options (["nouuid"]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o nouuid,defaults /dev/disk/by-id/google-pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0 /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/68d4a59218a3e3ff2872885ad3f67fb320859467c5370fe70b09b307bc60c25e/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/68d4a59218a3e3ff2872885ad3f67fb320859467c5370fe70b09b307bc60c25e/globalmount: wrong fs type, bad option, bad superblock on /dev/sdh, missing codepage or helper program, or other error.
  Warning  FailedMount             23s (x4 over 7m9s)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
➜  ~

➜  ~ k logs mysql-kjst5-mysql-2
Defaulted container "mysql" out of: mysql, metrics, vttablet, kb-checkrole, config-manager
Error from server (BadRequest): container "mysql" in pod "mysql-kjst5-mysql-2" is waiting to start: ContainerCreating

➜  ~ k get pvc | grep mysql
data-mysql-kjst5-mysql-0   Bound   pvc-a89a0800-d392-49d0-a6e2-a78853e2e5f3   1Gi   RWO   standard-rwo    153m
data-mysql-kjst5-mysql-1   Bound   pvc-de10db8c-b57a-453c-b52d-3955526cdd15   1Gi   RWO   standard-rwo    153m
data-mysql-kjst5-mysql-2   Bound   pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0   1Gi   RWO   kb-default-sc   10m
data-mysql-kjst5-mysql-3   Bound   pvc-683281bd-1565-45e3-bc9f-04a4b1b3628f   1Gi   RWO   kb-default-sc   10m
data-mysql-kjst5-mysql-4   Bound   pvc-e2195452-c8d8-4528-83aa-c99422114288   1Gi   RWO   kb-default-sc   10m

ahjing99 commented 1 year ago

MongoDB restore also hits the same error after upgrading kb from 0.5.3 to 0.6.0.

    `kbcli cluster backup mongo-qb8xr --type snapshot --namespace default `

Backup backup-default-mongo-qb8xr-20230802170108 created successfully, you can view the progress:

 `kbcli cluster restore mongo-qb8xr-backup --backup backup-default-mongo-qb8xr-20230802170108 --namespace default `

Cluster mongo-qb8xr-backup created

➜  ~ k describe pod mongo-qb8xr-backup-mongodb-0
Name:           mongo-qb8xr-backup-mongodb-0
Namespace:      default
Priority:       0
Node:           gke-yjtest-default-pool-8e798dc1-2pmb/10.128.15.228
Start Time:     Wed, 02 Aug 2023 17:02:28 +0800
Labels:         app.kubernetes.io/component=mongodb
                app.kubernetes.io/instance=mongo-qb8xr-backup
                app.kubernetes.io/managed-by=kubeblocks
                app.kubernetes.io/name=mongodb
                app.kubernetes.io/version=mongodb-5.0.14
                apps.kubeblocks.io/component-name=mongodb
                apps.kubeblocks.io/workload-type=Consensus
                controller-revision-hash=mongo-qb8xr-backup-mongodb-d6fc6d8c9
                statefulset.kubernetes.io/pod-name=mongo-qb8xr-backup-mongodb-0
Annotations:    apps.kubeblocks.io/component-replicas: 3
                cs.apps.kubeblocks.io/leader:
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/mongo-qb8xr-backup-mongodb
Containers:
  mongodb:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/mongo:5.0.14
    Image ID:
    Port:          27017/TCP
    Host Port:     0/TCP
    Command:
      /scripts/replicaset-setup.sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     200m
      memory:  322122547200m
    Requests:
      cpu:     200m
      memory:  322122547200m
    Environment Variables from:
      mongo-qb8xr-backup-mongodb-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mongo-qb8xr-backup-mongodb-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mongo-qb8xr-backup
      KB_COMP_NAME:              mongodb
      KB_CLUSTER_COMP_NAME:      mongo-qb8xr-backup-mongodb
      KB_CLUSTER_UID_POSTFIX_8:  93553eeb
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      MONGODB_ROOT_USER:         <set to the key 'username' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
      MONGODB_ROOT_PASSWORD:     <set to the key 'password' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
    Mounts:
      /data/mongodb from data (rw)
      /etc/mongodb/keyfile from mongodb-config (rw,path="keyfile")
      /etc/mongodb/mongodb.conf from mongodb-config (rw,path="mongodb.conf")
      /scripts from scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ds7pn (ro)
  metrics:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/agamotto:0.1.2-beta.1
    Image ID:
    Port:          9216/TCP
    Host Port:     0/TCP
    Command:
      /bin/agamotto
      --config=/opt/conf/metrics-config.yaml
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      mongo-qb8xr-backup-mongodb-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:               mongo-qb8xr-backup-mongodb-0 (v1:metadata.name)
      KB_POD_UID:                 (v1:metadata.uid)
      KB_NAMESPACE:              default (v1:metadata.namespace)
      KB_SA_NAME:                 (v1:spec.serviceAccountName)
      KB_NODENAME:                (v1:spec.nodeName)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_IP:                  (v1:status.podIP)
      KB_POD_IPS:                 (v1:status.podIPs)
      KB_HOSTIP:                  (v1:status.hostIP)
      KB_PODIP:                   (v1:status.podIP)
      KB_PODIPS:                  (v1:status.podIPs)
      KB_CLUSTER_NAME:           mongo-qb8xr-backup
      KB_COMP_NAME:              mongodb
      KB_CLUSTER_COMP_NAME:      mongo-qb8xr-backup-mongodb
      KB_CLUSTER_UID_POSTFIX_8:  93553eeb
      KB_POD_FQDN:               $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      MONGODB_ROOT_USER:         <set to the key 'username' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
      MONGODB_ROOT_PASSWORD:     <set to the key 'password' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
    Mounts:
      /opt/conf from mongodb-metrics-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ds7pn (ro)
  kb-checkrole:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.6.0-beta.21
    Image ID:
    Ports:         3501/TCP, 50001/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      probe
      --app-id
      batch-sdk
      --dapr-http-port
      3501
      --dapr-grpc-port
      50001
      --log-level
      info
      --config
      /config/probe/config.yaml
      --components-path
      /config/probe/components
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:      0
      memory:   0
    Readiness:  http-get http://:3501/v1.0/bindings/mongodb%3Foperation=checkRole delay=0s timeout=1s period=2s #success=1 #failure=3
    Startup:    tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      mongo-qb8xr-backup-mongodb-env  ConfigMap  Optional: false
    Environment:
      KB_POD_NAME:                mongo-qb8xr-backup-mongodb-0 (v1:metadata.name)
      KB_POD_UID:                  (v1:metadata.uid)
      KB_NAMESPACE:               default (v1:metadata.namespace)
      KB_SA_NAME:                  (v1:spec.serviceAccountName)
      KB_NODENAME:                 (v1:spec.nodeName)
      KB_HOST_IP:                  (v1:status.hostIP)
      KB_POD_IP:                   (v1:status.podIP)
      KB_POD_IPS:                  (v1:status.podIPs)
      KB_HOSTIP:                   (v1:status.hostIP)
      KB_PODIP:                    (v1:status.podIP)
      KB_PODIPS:                   (v1:status.podIPs)
      KB_CLUSTER_NAME:            mongo-qb8xr-backup
      KB_COMP_NAME:               mongodb
      KB_CLUSTER_COMP_NAME:       mongo-qb8xr-backup-mongodb
      KB_CLUSTER_UID_POSTFIX_8:   93553eeb
      KB_POD_FQDN:                $(KB_POD_NAME).$(KB_CLUSTER_COMP_NAME)-headless.$(KB_NAMESPACE).svc
      KB_SERVICE_USER:            <set to the key 'username' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
      KB_SERVICE_PASSWORD:        <set to the key 'password' in secret 'mongo-qb8xr-backup-conn-credential'>  Optional: false
      KB_SERVICE_PORT:            27017
      KB_SERVICE_ROLES:           {"primary":"ReadWrite","secondary":"Readonly"}
      KB_SERVICE_CHARACTER_TYPE:  mongodb
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ds7pn (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-mongo-qb8xr-backup-mongodb-0
    ReadOnly:   false
  mongodb-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mongo-qb8xr-backup-mongodb-mongodb-config
    Optional:  false
  mongodb-metrics-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mongo-qb8xr-backup-mongodb-mongodb-metrics-config-new
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      mongo-qb8xr-backup-mongodb-mongodb-scripts
    Optional:  false
  kube-api-access-ds7pn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 kb-data=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               30m                   default-scheduler        Successfully assigned default/mongo-qb8xr-backup-mongodb-0 to gke-yjtest-default-pool-8e798dc1-2pmb
  Normal   SuccessfulAttachVolume  30m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d"
  Warning  FailedMount             7m42s (x10 over 28m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount             3m22s (x21 over 30m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount") with fstype ("xfs") and options (["nouuid"]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o nouuid,defaults /dev/disk/by-id/google-pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount: wrong fs type, bad option, bad superblock on /dev/sdk, missing codepage or helper program, or other error.
➜  ~

zjx20 commented 1 year ago

The key error message is `wrong fs type`.

Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               30m                   default-scheduler        Successfully assigned default/mongo-qb8xr-backup-mongodb-0 to gke-yjtest-default-pool-8e798dc1-2pmb
  Normal   SuccessfulAttachVolume  30m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d"
  Warning  FailedMount             7m42s (x10 over 28m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount             3m22s (x21 over 30m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount") with fstype ("xfs") and options (["nouuid"]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o nouuid,defaults /dev/disk/by-id/google-pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount: wrong fs type, bad option, bad superblock on /dev/sdk, missing codepage or helper program, or other error.
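
As a rough illustration of how the relevant fields can be pulled out of such a `FailedMount` message, here is a minimal Python sketch. The regular expression is only an assumption based on the message wording shown in this issue; other CSI drivers or kubelet versions may phrase the error differently, and `parse_failed_mount` is a hypothetical helper name.

```python
import re

# Assumption: the pattern mirrors the kubelet/CSI message quoted in this issue.
MOUNT_ERROR = re.compile(
    r'volume "(?P<volume>[^"]+)".*?'
    r'with fstype \("(?P<fstype>[^"]+)"\).*?'
    r'mount failed: (?P<reason>exit status \d+)',
    re.DOTALL,
)


def parse_failed_mount(message):
    """Return the volume, requested fstype and mount exit reason, or None."""
    match = MOUNT_ERROR.search(message)
    return match.groupdict() if match else None


# Event message copied from the pod events above.
message = (
    'MountVolume.MountDevice failed for volume "pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d" : '
    'rpc error: code = Internal desc = Failed to format and mount device from '
    '("/dev/disk/by-id/google-pvc-49315883-2b83-4bcf-8d7d-adaa7a20d66d") to '
    '("/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/'
    '66ce5cf6fc2312dfacea131d1774ff91fb67ce7ae7cdd7fac0188cd884f692a5/globalmount") '
    'with fstype ("xfs") and options (["nouuid"]): mount failed: exit status 32'
)

info = parse_failed_mount(message)
print(info)
```

The extracted `fstype` is the filesystem the driver tried to mount, which is the value to compare against the filesystem actually present on the disk.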

On the surface, the problem is that an ext4 snapshot (the snapshot source uses the StorageClass standard-rwo) is being restored to an xfs PV (StorageClass kb-default-sc), so the mount fails. The underlying question is why kb-default-sc uses xfs.

➜  ~  k get sc standard-rwo -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    components.gke.io/layer: addon
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2023-07-31T03:02:00Z"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    k8s-app: gcp-compute-persistent-disk-csi-driver
  name: standard-rwo
  resourceVersion: "772"
  uid: a481566f-60a9-46d5-9671-114846426573
parameters:
  type: pd-balanced
provisioner: pd.csi.storage.gke.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

➜  ~   k get sc kb-default-sc -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    meta.helm.sh/release-name: kubeblocks
    meta.helm.sh/release-namespace: kb-system
  creationTimestamp: "2023-08-02T05:04:28Z"
  labels:
    app.kubernetes.io/instance: kubeblocks
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kubeblocks
    app.kubernetes.io/version: 0.6.0-beta.21
    helm.sh/chart: kubeblocks-0.6.0-beta.21
  name: kb-default-sc
  resourceVersion: "1745840"
  uid: a5a94b68-bbbe-4148-9c23-bd6941df8383
parameters:
  csi.storage.k8s.io/fstype: xfs
  type: pd-balanced
provisioner: pd.csi.storage.gke.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
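
The comparison between the two StorageClasses can be sketched in Python as below. This is only an illustration: the `parameters` dictionaries are copied from the YAML above, while the ext4 fallback for a StorageClass without `csi.storage.k8s.io/fstype` is an assumption that matches the ext4 snapshot described in this comment, and both function names are hypothetical.

```python
FSTYPE_KEY = "csi.storage.k8s.io/fstype"
# Assumption: a StorageClass without an explicit fstype ends up with ext4,
# as observed for standard-rwo in this issue.
ASSUMED_DEFAULT_FSTYPE = "ext4"


def effective_fstype(parameters):
    """Filesystem a StorageClass would request, given its `parameters` map."""
    return parameters.get(FSTYPE_KEY, ASSUMED_DEFAULT_FSTYPE)


def same_filesystem(source_parameters, target_parameters):
    """True when a volume from the source class can be mounted as the target class expects."""
    return effective_fstype(source_parameters) == effective_fstype(target_parameters)


# `parameters` sections taken from the two StorageClasses above.
standard_rwo = {"type": "pd-balanced"}
kb_default_sc = {"csi.storage.k8s.io/fstype": "xfs", "type": "pd-balanced"}

print(effective_fstype(standard_rwo), effective_fstype(kb_default_sc))
print(same_filesystem(standard_rwo, kb_default_sc))
```

With these inputs the two classes resolve to different filesystems, which is the mismatch behind the failed mount.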

lynnleelhl commented 1 year ago

v0.6 added the StorageClass kb-default-sc and made it the default, which means the StorageClass changes for any cluster that did not specify one explicitly. During h-scale, the newly created PVCs picked the new StorageClass, which caused the inconsistency. According to @ldming, kb-default-sc is only set on the cloud version and the others are still compatible, so this problem can be solved later.
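
A small Python sketch of how that inconsistency could be spotted from the PVC listing in the original report. It assumes the header-less column order of `kubectl get pvc` as it appears in that listing (name first, StorageClass sixth); the helper names are hypothetical.

```python
from collections import defaultdict

# Rows copied from the `k get pvc | grep mysql` output in the report.
PVC_LISTING = """\
data-mysql-kjst5-mysql-0 Bound pvc-a89a0800-d392-49d0-a6e2-a78853e2e5f3 1Gi RWO standard-rwo 153m
data-mysql-kjst5-mysql-1 Bound pvc-de10db8c-b57a-453c-b52d-3955526cdd15 1Gi RWO standard-rwo 153m
data-mysql-kjst5-mysql-2 Bound pvc-4a651ab1-e33b-4e0c-a6f8-c65ca708f4e0 1Gi RWO kb-default-sc 10m
data-mysql-kjst5-mysql-3 Bound pvc-683281bd-1565-45e3-bc9f-04a4b1b3628f 1Gi RWO kb-default-sc 10m
data-mysql-kjst5-mysql-4 Bound pvc-e2195452-c8d8-4528-83aa-c99422114288 1Gi RWO kb-default-sc 10m
"""


def group_by_storage_class(listing):
    """Map each StorageClass name to the PVC names bound with it."""
    groups = defaultdict(list)
    for line in listing.strip().splitlines():
        fields = line.split()
        # Assumed columns: NAME STATUS VOLUME CAPACITY ACCESS-MODES STORAGECLASS AGE
        groups[fields[5]].append(fields[0])
    return dict(groups)


def has_mixed_classes(groups):
    """True when the PVCs of one component use more than one StorageClass."""
    return len(groups) > 1


groups = group_by_storage_class(PVC_LISTING)
for storage_class, pvcs in groups.items():
    print(storage_class, pvcs)
print("mixed:", has_mixed_classes(groups))
```

Here the two original replicas stay on standard-rwo while the three PVCs created by the scale-out land on kb-default-sc.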

lynnleelhl commented 1 year ago

Closing, as this might not be a compatibility problem in the current situation.