apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG]KB CrashLoopBackOff and pg cluster ConditionsError after upgrade from 0.4.5 to 0.5.0 #3366

Closed ahjing99 closed 1 year ago

ahjing99 commented 1 year ago

related issue https://github.com/apecloud/kubeblocks/issues/3346

  1. Install kbcli 0.4.5, install kubeblocks 0.4.5, create mysql and pg cluster
    ➜  ~ kbcli cluster list
    NAME        NAMESPACE   CLUSTER-DEFINITION   VERSION             TERMINATION-POLICY   STATUS    CREATED-TIME
    mycluster   default     apecloud-mysql       ac-mysql-8.0.30     Delete               Running   May 22,2023 11:14 UTC+0800
    pgcluster   default     postgresql           postgresql-14.7.0   Delete               Running   May 22,2023 11:18 UTC+0800
  2. Remove kbcli 0.4.5, install kbcli 0.5.0, and upgrade KubeBlocks to 0.5.0
    
    kbcli kubeblocks upgrade --version 0.5.0
    Current KubeBlocks version {0.4.5 v1.25.8-gke.500 0.5.0}.
    Kubernetes version 1.25.8
    Kubernetes provider GKE
    kbcli version 0.5.0
    Add and update repo kubeblocks                     OK
    Upgrading KubeBlocks to 0.5.0                      OK

    KubeBlocks has been upgraded to 0.5.0 SUCCESSFULLY!

    -> Basic commands for cluster:
        kbcli cluster create -h     # help information about creating a database cluster
        kbcli cluster list          # list all database clusters
        kbcli cluster describe      # get cluster information

    -> Uninstall KubeBlocks:
        kbcli kubeblocks uninstall

    -> To view the monitoring add-ons web console:
        kbcli dashboard list        # list all monitoring web consoles
        kbcli dashboard open        # open the web console in the default browser

3. kubeblocks crash

    ➜  ~ k get pod
    NAME                                                    READY   STATUS             RESTARTS       AGE
    csi-attacher-s3-0                                       1/1     Running            1 (10m ago)    10m
    csi-provisioner-s3-0                                    2/2     Running            0              10m
    csi-s3-mbs64                                            2/2     Running            0              10m
    csi-s3-rld5x                                            2/2     Running            0              10m
    csi-s3-rm2w8                                            2/2     Running            0              10m
    kb-addon-alertmanager-webhook-adaptor-b8df446b6-mk4jq   2/2     Running            0              34m
    kb-addon-grafana-847ffd849-t9c86                        3/3     Running            0              34m
    kb-addon-prometheus-alertmanager-0                      2/2     Running            0              34m
    kb-addon-prometheus-server-0                            2/2     Running            0              34m
    kb-addon-snapshot-controller-65b6db596-mftsw            1/1     Running            0              32m
    kubeblocks-747d6ccd4f-m7fmf                             0/1     CrashLoopBackOff   6 (4m4s ago)   12m
    mycluster-mysql-0                                       4/4     Running            0              24m
    pgcluster-pg-replication-0                              3/3     Running            0              11m
    pgcluster-pg-replication-1                              3/3     Running            0              11m

    ➜  ~ k describe pod kubeblocks-747d6ccd4f-m7fmf
    Name:         kubeblocks-747d6ccd4f-m7fmf
    Namespace:    default
    Priority:     0
    Node:         gke-yjtest-default-pool-f5a74fb6-1vz9/10.128.0.57
    Start Time:   Mon, 22 May 2023 11:26:40 +0800
    Labels:       app.kubernetes.io/instance=kubeblocks
                  app.kubernetes.io/name=kubeblocks
                  pod-template-hash=747d6ccd4f
    Annotations:
    Status:       Running
    IP:           10.104.0.16
    IPs:
      IP:           10.104.0.16
    Controlled By:  ReplicaSet/kubeblocks-747d6ccd4f
    Init Containers:
      tools:
        Container ID:  containerd://0f4c160e6b4844179e41c52eb9c702b4be6611d44f30520f6f6afba64c0b0b77
        Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.5.0
        Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools@sha256:d8b418b6780c690884afbfffa473515dc6795e9d187806daf53702b2afdc2bcc
        Port:
        Host Port:
        Command:
          /bin/true
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Mon, 22 May 2023 11:27:11 +0800
          Finished:     Mon, 22 May 2023 11:27:11 +0800
        Ready:          True
        Restart Count:  0
        Environment:
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7k59m (ro)
    Containers:
      manager:
        Container ID:  containerd://8f3c4849ab3bbd066dafce34dd1eb0984d5e1b50b06bce0da2dbc75b787f4fa2
        Image:         registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks:0.5.0
        Image ID:      registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks@sha256:653ee4d3cf3d88aa27baea530d81cffaeb376e92b2976dd73f7fb0de12ca27cb
        Ports:         9443/TCP, 8081/TCP, 8080/TCP
        Host Ports:    0/TCP, 0/TCP, 0/TCP
        Args:
          --health-probe-bind-address=:8081
          --metrics-bind-address=:8080
          --leader-elect
          --zap-devel=false
          --zap-time-encoding=iso8601
          --zap-encoder=console
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    2
          Started:      Mon, 22 May 2023 11:34:59 +0800
          Finished:     Mon, 22 May 2023 11:35:18 +0800
        Ready:          False
        Restart Count:  6
        Liveness:       http-get http://:health/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
        Readiness:      http-get http://:health/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
        Environment:
          CM_NAMESPACE:                    default
          CM_AFFINITY:                     {"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"kb-controller","operator":"In","values":["true"]}]},"weight":100}]}}
          CM_TOLERATIONS:                  [{"effect":"NoSchedule","key":"kb-controller","operator":"Equal","value":"true"}]
          KUBEBLOCKS_IMAGE_PULL_POLICY:    IfNotPresent
          KUBEBLOCKS_TOOLS_IMAGE:          registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.5.0
          KUBEBLOCKS_SERVICEACCOUNT_NAME:  kubeblocks
          VOLUMESNAPSHOT_API_BETA:         true
          ADDON_JOB_TTL:
          ADDON_JOB_IMAGE_PULL_POLICY:     IfNotPresent
          KUBEBLOCKS_ADDON_SA_NAME:        kubeblocks-addon-installer
        Mounts:
          /etc/kubeblocks from manager-config (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7k59m (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      manager-config:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      kubeblocks-manager-config
        Optional:  false
      kube-api-access-7k59m:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:
    Tolerations:                 kb-controller=true:NoSchedule
                                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason     Age                    From               Message
      Normal   Scheduled  13m                    default-scheduler  Successfully assigned default/kubeblocks-747d6ccd4f-m7fmf to gke-yjtest-default-pool-f5a74fb6-1vz9
      Normal   Pulling    13m                    kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.5.0"
      Normal   Pulled     12m                    kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.5.0" in 30.375110024s (30.375126493s including waiting)
      Normal   Created    12m                    kubelet            Created container tools
      Normal   Started    12m                    kubelet            Started container tools
      Normal   Pulling    12m                    kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks:0.5.0"
      Normal   Pulled     12m                    kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks:0.5.0" in 17.522728091s (17.522736828s including waiting)
      Normal   Created    10m (x4 over 12m)      kubelet            Created container manager
      Normal   Started    10m (x4 over 12m)      kubelet            Started container manager
      Normal   Pulled     10m (x3 over 12m)      kubelet            Container image "registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks:0.5.0" already present on machine
      Warning  BackOff    2m59s (x42 over 11m)   kubelet            Back-off restarting failed container

    ➜  ~ k logs kubeblocks-747d6ccd4f-m7fmf
    Defaulted container "manager" out of: manager, tools (init)
    2023-05-22T03:34:59.521Z INFO setup config file: /etc/kubeblocks/config.yaml
    2023-05-22T03:34:59.522Z INFO setup config settings: map[alsologtostderr:false backup_pv_configmap_name: backup_pv_configmap_namespace: backup_pvc_create_policy: backup_pvc_init_capacity: backup_pvc_name: backup_pvc_storage_class: cert_dir:/tmp/k8s-webhook-server/serving-certs cm_namespace:default cm_recon_retry_duration_ms:100 config_manager_grpc_port:9901 config_manager_log_level:info enable_debug_sysaccounts:false health_probe_bind_address::8081 kill_container_signal:SIGKILL kubeblocks_addon_helm_install_options:[--atomic --cleanup-on-fail --wait] kubeblocks_addon_helm_uninstall_options:[] kubeblocks_addon_sa_name:kubeblocks-addon-installer kubeblocks_serviceaccount_name:kubeblocks kubeblocks_tools_image:registry.cn-hangzhou.aliyuncs.com/apecloud/kubeblocks-tools:0.5.0 kubeconfig: leader_elect:true log_backtrace_at::0 log_dir: logtostderr:false maxconcurrentreconciles_addon:8 maxconcurrentreconciles_clusterdef:8 maxconcurrentreconciles_clusterversion:8 maxconcurrentreconciles_dataprotection:8 metrics_bind_address::8080 pod_min_ready_seconds:10 probe_service_grpc_port:50001 probe_service_http_port:3501 probe_service_log_level:info stderrthreshold:2 v:0 vmodule: volumesnapshot:false volumesnapshot_api_beta:true zap_devel:false zap_encoder:console zap_log_level: zap_stacktrace_level: zap_time_encoding:iso8601]
    2023-05-22T03:35:00.376Z INFO controller-runtime.metrics Metrics server is starting to listen {"addr": ":8080"}
    2023-05-22T03:35:00.378Z INFO setup starting manager
    2023-05-22T03:35:00.379Z INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
    2023-05-22T03:35:00.379Z INFO Starting server {"kind": "health probe", "addr": "[::]:8081"}
    I0522 03:35:00.379108 1 leaderelection.go:248] attempting to acquire leader lease default/001c317f.kubeblocks.io...
    I0522 03:35:17.859085 1 leaderelection.go:258] successfully acquired lease default/001c317f.kubeblocks.io
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1alpha1.Cluster"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "clusterdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterDefinition", "source": "kind source: v1alpha1.ClusterDefinition"}
    2023-05-22T03:35:17.859Z INFO Starting Controller {"controller": "clusterdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterDefinition"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.StatefulSet"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.Deployment"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.Service"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.Secret"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.ConfigMap"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.PersistentVolumeClaim"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.PodDisruptionBudget"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1alpha1.BackupPolicy"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1alpha1.Backup"}
    2023-05-22T03:35:17.859Z INFO Starting Controller {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster"}
    2023-05-22T03:35:17.859Z INFO Starting EventSource {"controller": "backuppolicy", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupPolicy", "source": "kind source: v1alpha1.BackupPolicy"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "backuppolicy", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupPolicy"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "backuptool", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupTool", "source": "kind source: v1alpha1.BackupTool"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "backuptool", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupTool"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "backup", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "Backup", "source": "kind source: v1alpha1.Backup"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "cronjob", "controllerGroup": "batch", "controllerKind": "CronJob", "source": "kind source: v1.CronJob"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "cronjob", "controllerGroup": "batch", "controllerKind": "CronJob", "source": "kind source: v1.Job"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "backup", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "Backup", "source": "kind source: v1.Job"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "cronjob", "controllerGroup": "batch", "controllerKind": "CronJob"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "backup", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "Backup"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "opsrequest", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "OpsRequest", "source": "kind source: v1alpha1.OpsRequest"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "opsrequest", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "OpsRequest"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "restorejob", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "RestoreJob", "source": "kind source: v1alpha1.RestoreJob"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "restorejob", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "RestoreJob"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "clusterversion", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterVersion", "source": "kind source: v1alpha1.ClusterVersion"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "clusterversion", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterVersion"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "addon", "controllerGroup": "extensions.kubeblocks.io", "controllerKind": "Addon", "source": "kind source: v1alpha1.Addon"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "addon", "controllerGroup": "extensions.kubeblocks.io", "controllerKind": "Addon", "source": "kind source: v1.Job"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "addon", "controllerGroup": "extensions.kubeblocks.io", "controllerKind": "Addon"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1alpha1.Cluster"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "configconstraint", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ConfigConstraint", "source": "kind source: v1alpha1.ConfigConstraint"}
    2023-05-22T03:35:17.860Z INFO Starting EventSource {"controller": "configconstraint", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ConfigConstraint", "source": "kind source: v1.ConfigMap"}
    2023-05-22T03:35:17.860Z INFO Starting Controller {"controller": "configconstraint", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ConfigConstraint"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "configmap", "controllerGroup": "", "controllerKind": "ConfigMap", "source": "kind source: v1.ConfigMap"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "configmap", "controllerGroup": "", "controllerKind": "ConfigMap"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "source": "kind source: v1.PersistentVolumeClaim"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "source": "kind source: v1.Event"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "event", "controllerGroup": "", "controllerKind": "Event"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "deployment-watcher", "controllerGroup": "apps", "controllerKind": "Deployment", "source": "kind source: v1.Deployment"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "deployment-watcher", "controllerGroup": "apps", "controllerKind": "Deployment", "source": "kind source: v1.ReplicaSet"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "deployment-watcher", "controllerGroup": "apps", "controllerKind": "Deployment", "source": "kind source: v1.Pod"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "deployment-watcher", "controllerGroup": "apps", "controllerKind": "Deployment"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.Secret"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "source": "kind source: v1.Job"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet", "source": "kind source: v1.StatefulSet"}
    2023-05-22T03:35:17.861Z INFO Starting EventSource {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet", "source": "kind source: v1.Pod"}
    2023-05-22T03:35:17.861Z INFO Starting Controller {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet"}
    2023-05-22T03:35:17.862Z INFO Starting EventSource {"controller": "componentclassdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ComponentClassDefinition", "source": "kind source: v1alpha1.ComponentClassDefinition"}
    2023-05-22T03:35:17.862Z INFO Starting Controller {"controller": "componentclassdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ComponentClassDefinition"}
    2023-05-22T03:35:18.061Z INFO Starting workers {"controller": "backuptool", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupTool", "worker count": 8}
    2023-05-22T03:35:18.061Z INFO Starting workers {"controller": "configmap", "controllerGroup": "", "controllerKind": "ConfigMap", "worker count": 1}
    2023-05-22T03:35:18.062Z INFO Starting workers {"controller": "backuppolicy", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "BackupPolicy", "worker count": 8}
    2023-05-22T03:35:18.062Z INFO Starting workers {"controller": "persistentvolumeclaim", "controllerGroup": "", "controllerKind": "PersistentVolumeClaim", "worker count": 1}
    2023-05-22T03:35:18.063Z INFO Starting workers {"controller": "event", "controllerGroup": "", "controllerKind": "Event", "worker count": 1}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "backup", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "Backup", "worker count": 8}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet", "worker count": 1}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "addon", "controllerGroup": "extensions.kubeblocks.io", "controllerKind": "Addon", "worker count": 8}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "worker count": 1}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "deployment-watcher", "controllerGroup": "apps", "controllerKind": "Deployment", "worker count": 1}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "restorejob", "controllerGroup": "dataprotection.kubeblocks.io", "controllerKind": "RestoreJob", "worker count": 8}
    2023-05-22T03:35:18.064Z INFO Starting workers {"controller": "cronjob", "controllerGroup": "batch", "controllerKind": "CronJob", "worker count": 1}
    2023-05-22T03:35:18.065Z INFO Starting workers {"controller": "componentclassdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ComponentClassDefinition", "worker count": 1}
    2023-05-22T03:35:18.065Z INFO Starting workers {"controller": "opsrequest", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "OpsRequest", "worker count": 1}
    2023-05-22T03:35:18.065Z INFO Starting workers {"controller": "clusterversion", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterVersion", "worker count": 8}
    2023-05-22T03:35:18.066Z INFO Starting workers {"controller": "clusterdefinition", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ClusterDefinition", "worker count": 8}
    2023-05-22T03:35:18.066Z INFO Starting workers {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "worker count": 1}
    2023-05-22T03:35:18.066Z INFO Starting workers {"controller": "configconstraint", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "ConfigConstraint", "worker count": 1}
    2023-05-22T03:35:18.079Z INFO component status changed {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet", "StatefulSet": {"name":"mycluster-mysql","namespace":"default"}, "namespace": "default", "name": "mycluster-mysql", "reconcileID": "608aecd1-39a3-4696-acd1-40e9706c0543", "statefulSet": "default/mycluster-mysql", "componentName": "mysql", "phase": "Running", "componentIsRunning": true, "podsAreReady": true}
    2023-05-22T03:35:18.143Z INFO DAG: |->{obj:*v1alpha1.Cluster, immutable: false, action: STATUS} {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "Cluster": {"name":"pgcluster","namespace":"default"}, "namespace": "default", "name": "pgcluster", "reconcileID": "261d6f86-ed00-4291-866e-16bcec642f3f", "cluster": "default/pgcluster"}
    2023-05-22T03:35:18.163Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "statefulset-watcher", "controllerGroup": "apps", "controllerKind": "StatefulSet", "StatefulSet": {"name":"pgcluster-pg-replication","namespace":"default"}, "namespace": "default", "name": "pgcluster-pg-replication", "reconcileID": "f4832cd4-f2c8-4f4c-ac32-0feb057286d7"}
    panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1bd2512]

    goroutine 831 [running]:
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:119 +0x1fa
    panic({0x1e21440, 0x342f590})
        /usr/local/go/src/runtime/panic.go:884 +0x213
    github.com/apecloud/kubeblocks/controllers/apps/components.workloadCompClusterReconcile({{0x240ae78, 0xc001f0d860}, {{{0xc000d66939, 0x7}, {0xc000611db8, 0x18}}}, {{0x2410268, 0xc001f0d8c0}, 0x0}, {0x2409bc0, ...}}, ...)
        /workspace/controllers/apps/components/component.go:151 +0x492
    github.com/apecloud/kubeblocks/controllers/apps/components.(*StatefulSetReconciler).Reconcile(0xc0004eb200, {0x240ae78, 0xc001f0d860}, {{{0xc000d66939, 0x7}, {0xc000611db8, 0x18}}})
        /workspace/controllers/apps/components/stateful_set_controller.go:77 +0x3ea
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x240ae78?, {0x240ae78?, 0xc001f0d860?}, {{{0xc000d66939?, 0x1d72220?}, {0xc000611db8?, 0x0?}}})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122 +0xc8
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000670f00, {0x240add0, 0xc00047d590}, {0x1e9f600?, 0xc000bd6920?})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323 +0x35f
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000670f00, {0x240add0, 0xc00047d590})
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274 +0x1d9
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235 +0x85
    created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:231 +0x587
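
When triaging a trace like this one, it can help to pick out the first frame that belongs to the project rather than to the Go runtime or controller-runtime. The following is only a minimal sketch: the helper name `first_project_frame` is hypothetical, the sample text is adapted from the trace in this report with argument lists shortened to `...`, and it assumes each function line is immediately followed by its file location line.

```python
import re

# Frames adapted from the trace in this report; arguments shortened to "...".
TRACE = """\
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:119 +0x1fa
panic(...)
    /usr/local/go/src/runtime/panic.go:884 +0x213
github.com/apecloud/kubeblocks/controllers/apps/components.workloadCompClusterReconcile(...)
    /workspace/controllers/apps/components/component.go:151 +0x492
github.com/apecloud/kubeblocks/controllers/apps/components.(*StatefulSetReconciler).Reconcile(...)
    /workspace/controllers/apps/components/stateful_set_controller.go:77 +0x3ea
"""

# A location line is indented and looks like "<path>.go:<line> +0x<offset>".
LOCATION_RE = re.compile(r"^\s+(?P<file>\S+\.go):(?P<line>\d+)")


def first_project_frame(trace, module_prefix):
    """Return (function, file, line) for the first frame whose function
    name starts with module_prefix, or None if there is no such frame.

    Hypothetical helper for illustration; assumes argument lists contain
    no nested parentheses (true for the shortened sample above).
    """
    lines = trace.splitlines()
    for func_line, loc_line in zip(lines, lines[1:]):
        if not func_line.startswith(module_prefix):
            continue
        match = LOCATION_RE.match(loc_line)
        if match:
            func_name = func_line.rsplit("(", 1)[0]  # drop the argument list
            return func_name, match.group("file"), int(match.group("line"))
    return None


if __name__ == "__main__":
    print(first_project_frame(TRACE, "github.com/apecloud/kubeblocks/"))
```

For the sample above, the helper points at `workloadCompClusterReconcile` in `component.go`, line 151, which is the frame directly below `panic` in the reported trace.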

4. pg cluster ConditionsError

    ➜  ~ kbcli cluster list
    NAME        NAMESPACE   CLUSTER-DEFINITION   VERSION             TERMINATION-POLICY   STATUS            CREATED-TIME
    mycluster   default     apecloud-mysql       ac-mysql-8.0.30     Delete               Running           May 22,2023 11:14 UTC+0800
    pgcluster   default     postgresql           postgresql-14.7.0   Delete               ConditionsError   May 22,2023 11:18 UTC+0800

    ➜  ~ kbcli cluster describe pgcluster
    Name: pgcluster    Created Time: May 22,2023 11:18 UTC+0800
    NAMESPACE   CLUSTER-DEFINITION   VERSION             STATUS            TERMINATION-POLICY
    default     postgresql           postgresql-14.7.0   ConditionsError   Delete

    Endpoints:
    COMPONENT        MODE        INTERNAL                                                  EXTERNAL
    pg-replication   ReadWrite   pgcluster-pg-replication.default.svc.cluster.local:5432
                                 pgcluster-pg-replication.default.svc.cluster.local:9187

    Topology:
    COMPONENT        INSTANCE                     ROLE        STATUS    AZ              NODE                                                CREATED-TIME
    pg-replication   pgcluster-pg-replication-0   primary     Running   us-central1-c   gke-yjtest-default-pool-f5a74fb6-1vz9/10.128.0.57   May 22,2023 11:27 UTC+0800
    pg-replication   pgcluster-pg-replication-1   secondary   Running   us-central1-c   gke-yjtest-default-pool-f5a74fb6-4cb0/10.128.0.58   May 22,2023 11:27 UTC+0800

    Resources Allocation:
    COMPONENT        DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)           STORAGE-SIZE   STORAGE-CLASS
    pg-replication   false       200m / 200m          644245094400m / 644245094400m   data:1Gi       standard-rwo

    Images:
    COMPONENT        TYPE             IMAGE
    pg-replication   pg-replication   registry.cn-hangzhou.aliyuncs.com/apecloud/postgresql:14.7.0

    Events(last 5 warnings, see more:kbcli cluster list-events -n default pgcluster):
    TIME                         TYPE      REASON                 OBJECT              MESSAGE
    May 22,2023 11:18 UTC+0800   Warning   ApplyResourcesFailed   Cluster/pgcluster   pod number in statefulset pgcluster-pg-replication-1 is not 1
    May 22,2023 11:27 UTC+0800   Warning   ReplicasNotReady       Cluster/pgcluster   pods are not ready in Components: [pg-replication], refer to related component message in Cluster.status.components
    May 22,2023 11:27 UTC+0800   Warning   ComponentsNotReady     Cluster/pgcluster   pods are unavailable in Components: [pg-replication], refer to related component message in Cluster.status.components
    May 22,2023 11:28 UTC+0800   Warning   PreCheckFailed         Cluster/pgcluster   requeue after: 100ms as: ref resource is unavailable, this problem needs to be solved first. cd: postgresql, cv: postgresql-14.7.0
    May 22,2023 11:28 UTC+0800   Warning   PreCheckFailed         Cluster/pgcluster   ref resource is unavailable, this problem needs to be solved first. cd: postgresql, cv: postgresql-14.7.0

ahjing99 commented 1 year ago
➜  ~ k get cd
NAME               MAIN-COMPONENT-NAME   STATUS      AGE
apecloud-mysql     mysql                 Available   44m
milvus             milvus                Available   21m
mongodb            mongodb               Available   21m
mongodb-sharding   mongos                Available   21m
postgresql         postgresql            Available   44m
qdrant             qdrant                Available   22m
redis              redis                             21m
weaviate           weaviate              Available   22m
➜  ~ k get cv
NAME                      CLUSTER-DEFINITION   STATUS        AGE
ac-mysql-8.0.30           apecloud-mysql       Available     44m
milvus-2.2.4              milvus               Available     21m
mongodb-5.0.14            mongodb              Available     22m
mongodb-sharding-5.0.14   mongodb-sharding     Available     22m
postgresql-12.14.0        postgresql           Available     21m
postgresql-14.7.0         postgresql           Unavailable   44m
postgresql-14.7.1         postgresql           Available     21m
qdrant-1.1.0              qdrant               Available     22m
redis-7.0.6               redis                Available     21m
weaviate-1.18.0           weaviate             Available     22m
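
The `k get cv` listing above shows `postgresql-14.7.0` as `Unavailable`, which matches the `PreCheckFailed` event on `pgcluster`. A rough sketch of how one might flag such rows from saved command output follows; the function name `unavailable_versions` is hypothetical, the sample is a subset of the table above, and the parsing assumes every cell is non-empty (it would not handle a blank STATUS cell such as the `redis` row in the `k get cd` output).

```python
# Subset of the `k get cv` output from this report.
CV_TABLE = """\
NAME                      CLUSTER-DEFINITION   STATUS        AGE
ac-mysql-8.0.30           apecloud-mysql       Available     44m
postgresql-12.14.0        postgresql           Available     21m
postgresql-14.7.0         postgresql           Unavailable   44m
postgresql-14.7.1         postgresql           Available     21m
"""


def unavailable_versions(table):
    """Return the NAME of every row whose STATUS is not "Available".

    Hypothetical helper: splits on whitespace, so it assumes no cell is
    empty and no cell contains spaces.
    """
    rows = [line.split() for line in table.strip().splitlines()]
    header, body = rows[0], rows[1:]
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    return [
        row[name_col]
        for row in body
        if len(row) == len(header) and row[status_col] != "Available"
    ]


if __name__ == "__main__":
    print(unavailable_versions(CV_TABLE))
```

Running this kind of check before and after an upgrade is one way to spot a ClusterVersion that an existing cluster still references but that is no longer available.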
apecloud-bot commented 1 year ago

Closing, as this is a known limitation: PostgreSQL clusters are not compatible when upgrading from 0.4 to 0.5.