apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] kubeblocks create 3 replicas mysql cluster Failed , two pod mysql databases cannot be connected properly #1218

Closed linghan-hub closed 8 months ago

linghan-hub commented 1 year ago

Describe the bug

Creating a 3-replica MySQL cluster with KubeBlocks on a local 3-node minikube environment ends with the cluster in the Failed state, and the MySQL instances on two of the three pods cannot be connected to.

Steps to reproduce the behavior:

  1. Create the cluster
    kubectl apply -f - << EOF
    apiVersion: dbaas.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      name: cluster-wesql-3replica
      namespace: default
    spec:
      clusterDefinitionRef: apecloud-mysql
      clusterVersionRef: ac-mysql-8.0.30
      terminationPolicy: WipeOut
      components:
      - name: mysql
        type: mysql
        monitor: true
        replicas: 3
        volumeClaimTemplates:
          - name: data
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
          - name: log
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
    EOF
  2. All pods are running, but the cluster status is Failed

    kubectl get cluster,pod
    NAME                                                 CLUSTER-DEFINITION   VERSION           TERMINATION-POLICY   STATUS   AGE
    cluster.dbaas.kubeblocks.io/cluster-wesql-3replica   apecloud-mysql       ac-mysql-8.0.30   WipeOut              Failed   19m

    NAME                                       READY   STATUS             RESTARTS        AGE
    pod/cluster-wesql-3replica-mysql-0         4/4     Running            0               19m
    pod/cluster-wesql-3replica-mysql-1         4/4     Running            0               19m
    pod/cluster-wesql-3replica-mysql-2         4/4     Running            0               19m
    pod/kubeblocks-655565d78-n9ct7             1/1     Running            0               25m
    pod/kubeblocks-grafana-94c6975ff-s2wgk     3/3     Running            0               25m
    pod/kubeblocks-prometheus-alertmanager-0   2/2     Running            0               25m
    pod/kubeblocks-prometheus-server-0         1/2     CrashLoopBackOff   9 (2m58s ago)   25m

  3. Check the cluster status, events, and pod roles

    kubectl describe cluster cluster-wesql-3replica
    Name:         cluster-wesql-3replica
    Namespace:    default
    Labels:       clusterdefinition.kubeblocks.io/name=apecloud-mysql
                  clusterversion.kubeblocks.io/name=ac-mysql-8.0.30
    Annotations:  kubeblocks.io/storage-class: standard
    API Version:  dbaas.kubeblocks.io/v1alpha1
    Kind:         Cluster
    Metadata:
      Creation Timestamp:  2023-02-03T06:53:02Z
      Finalizers:
        cluster.kubeblocks.io/finalizer
      Generation:  1
      Managed Fields:
        API Version:  dbaas.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:metadata:
            f:annotations:
              .:
              f:kubectl.kubernetes.io/last-applied-configuration:
          f:spec:
            .:
            f:clusterDefinitionRef:
            f:clusterVersionRef:
            f:components:
              .:
              k:{"name":"mysql"}:
                .:
                f:monitor:
                f:name:
                f:replicas:
                f:serviceType:
                f:type:
                f:volumeClaimTemplates:
            f:terminationPolicy:
        Manager:      kubectl-client-side-apply
        Operation:    Update
        Time:         2023-02-03T06:53:02Z
        API Version:  dbaas.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:metadata:
            f:annotations:
              f:kubeblocks.io/storage-class:
            f:finalizers:
              .:
              v:"cluster.kubeblocks.io/finalizer":
            f:labels:
              .:
              f:clusterdefinition.kubeblocks.io/name:
              f:clusterversion.kubeblocks.io/name:
        Manager:      manager
        Operation:    Update
        Time:         2023-02-03T06:53:03Z
        API Version:  dbaas.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:status:
            .:
            f:clusterDefGeneration:
            f:components:
              .:
              f:mysql:
                .:
                f:consensusSetStatus:
                  .:
                  f:followers:
                  f:leader:
                    .:
                    f:accessMode:
                    f:name:
                    f:pod:
                f:message:
                  .:
                  f:Pod/cluster-wesql-3replica-mysql-0:
                  f:Pod/cluster-wesql-3replica-mysql-2:
                f:phase:
                f:podsReady:
                f:podsReadyTime:
                f:type:
            f:conditions:
            f:observedGeneration:
            f:operations:
              .:
              f:horizontalScalable:
              f:restartable:
              f:verticalScalable:
            f:phase:
        Manager:         manager
        Operation:       Update
        Subresource:     status
        Time:            2023-02-03T06:54:15Z
      Resource Version:  2726
      UID:               c639c4da-a566-42b2-9071-edf48f6904a8
    Spec:
      Cluster Definition Ref:  apecloud-mysql
      Cluster Version Ref:     ac-mysql-8.0.30
      Components:
        Monitor:   true
        Name:      mysql
        Replicas:  3
        Resources:
        Service Type:  ClusterIP
        Type:          mysql
        Volume Claim Templates:
          Name:  data
          Spec:
            Access Modes:
              ReadWriteOnce
            Resources:
              Requests:
                Storage:  1Gi
          Name:  log
          Spec:
            Access Modes:
              ReadWriteOnce
            Resources:
              Requests:
                Storage:  1Gi
      Termination Policy:  WipeOut
    Status:
      Cluster Def Generation:  2
      Components:
        Mysql:
          Consensus Set Status:
            Followers:
              Access Mode:  Readonly
              Name:         follower
              Pod:          cluster-wesql-3replica-mysql-1
            Leader:
              Access Mode:  None
              Name:
              Pod:          Unknown
          Message:
            Pod/cluster-wesql-3replica-mysql-0:  Role probe timeout, check whether the application is available
            Pod/cluster-wesql-3replica-mysql-2:  Role probe timeout, check whether the application is available
          Phase:            Failed
          Pods Ready:       true
          Pods Ready Time:  2023-02-03T06:53:15Z
          Type:             mysql
      Conditions:
        Last Transition Time:  2023-02-03T06:53:02Z
        Message:               The operator has started the provisioning of Cluster: cluster-wesql-3replica
        Reason:                PreCheckSucceed
        Status:                True
        Type:                  ProvisioningStarted
        Last Transition Time:  2023-02-03T06:53:03Z
        Message:               Successfully applied for resources
        Reason:                ApplyResourcesSucceed
        Status:                True
        Type:                  ApplyResources
        Last Transition Time:  2023-02-03T06:53:15Z
        Message:               all pods of components are ready, waiting for the probe detection successful
        Reason:                AllReplicasReady
        Status:                True
        Type:                  ReplicasReady
        Last Transition Time:  2023-02-03T06:54:15Z
        Message:               pods are unavailable in Components: [mysql], refer to related component message in Cluster.status.components
        Reason:                ComponentsNotReady
        Status:                False
        Type:                  Ready
      Observed Generation:     1
      Operations:
        Horizontal Scalable:
          Name:  mysql
        Restartable:
          mysql
        Vertical Scalable:
          mysql
      Phase:  Failed
    Events:
      Type     Reason                 Age                From                       Message
      Normal   Creating               20m                cluster-controller         Start Creating in Cluster: cluster-wesql-3replica
      Warning  NotFound               20m (x8 over 20m)  system-account-controller  Endpoints "cluster-wesql-3replica-mysql" not found
      Normal   PreCheckSucceed        20m (x2 over 20m)  cluster-controller         The operator has started the provisioning of Cluster: cluster-wesql-3replica
      Normal   ApplyResourcesSucceed  20m                cluster-controller         Successfully applied for resources
      Normal   AllReplicasReady       19m (x5 over 20m)  cluster-controller         all pods of components are ready, waiting for the probe detection successful
      Warning  ProbeTimeout           19m                stateful-set-controller    pod role detection timed out in Component: mysql
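The per-pod probe messages are buried deep in the describe output. A minimal sketch for pulling out just that field follows; it assumes the `status.components.<name>.message` path that appears in the managed fields above, and the helper name `component_probe_messages` is made up for illustration.

```shell
# Print only the per-pod probe messages of one component of a Cluster.
# $1 = cluster name, $2 = component name (both as used in this report).
# The JSONPath below assumes the status layout shown in the describe output.
component_probe_messages() {
  cluster="$1"
  component="$2"
  kubectl get cluster "$cluster" -o jsonpath="{.status.components.$component.message}"
}

# usage: component_probe_messages cluster-wesql-3replica mysql
```

This only narrows the output; it does not explain why the role probe timed out.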

    kubectl get pod -L cs.dbaas.kubeblocks.io/role -o wide
    NAME                                   READY   STATUS             RESTARTS       AGE   IP            NODE           NOMINATED NODE   READINESS GATES   ROLE
    cluster-wesql-3replica-mysql-0         4/4     Running            0              21m   10.244.1.10   minikube-m02   <none>           <none>
    cluster-wesql-3replica-mysql-1         4/4     Running            0              21m   10.244.0.4    minikube       <none>           <none>            follower
    cluster-wesql-3replica-mysql-2         4/4     Running            0              21m   10.244.2.10   minikube-m03   <none>           <none>
    kubeblocks-655565d78-n9ct7             1/1     Running            0              28m   10.244.1.3    minikube-m02   <none>           <none>
    kubeblocks-grafana-94c6975ff-s2wgk     3/3     Running            0              28m   10.244.1.4    minikube-m02   <none>           <none>
    kubeblocks-prometheus-alertmanager-0   2/2     Running            0              28m   10.244.1.5    minikube-m02   <none>           <none>
    kubeblocks-prometheus-server-0         1/2     CrashLoopBackOff   10 (24s ago)   28m   10.244.1.6    minikube-m02   <none>           <none>
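To spot the pods whose role was never detected without scanning the ROLE column by eye, a label selector can do the filtering. This is only a sketch: it assumes the role label is absent (not just empty) on the affected pods, and the helper name `pods_without_role` is hypothetical.

```shell
# List pods that do not carry the role label at all.
# $1 = label key (defaults to the key used in this report).
pods_without_role() {
  label="${1:-cs.dbaas.kubeblocks.io/role}"
  # '!key' is the Kubernetes label-selector syntax for "key is not set".
  # The '!' is single-quoted so interactive shells do not treat it as history expansion.
  kubectl get pod -l '!'"$label" -o name
}

# usage: pods_without_role
```

In the namespace shown above this would also list the kubeblocks-* pods, since they have no role label either.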

  4. MySQL login fails on cluster-wesql-3replica-mysql-0 and cluster-wesql-3replica-mysql-2, but works on cluster-wesql-3replica-mysql-1

    kubectl exec -it cluster-wesql-3replica-mysql-0 -c mysql bash
    kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
    [root@cluster-wesql-3replica-mysql-0 /]# mysql -p$MYSQL_ROOT_PASSWORD
    mysql: [Warning] Using a password on the command line interface can be insecure.
    ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

    [root@cluster-wesql-3replica-mysql-0 /]# kubectl exec -it cluster-wesql-3replica-mysql-1 -c mysql bash
    kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
    [root@cluster-wesql-3replica-mysql-1 /]# mysql -p$MYSQL_ROOT_PASSWORD
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 14
    Server version: 8.0.30 WeSQL Server - GPL, Release 5, Revision d6b8719

    Copyright (c) 2000, 2022, Oracle and/or its affiliates.

    Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql> select * from information_schema.wesql_cluster_local;
    +-----------+--------------+------------------------------------------------------------------+--------------+---------------+----------------+----------+-----------+------------------+---------------------+---------------+
    | SERVER_ID | CURRENT_TERM | CURRENT_LEADER                                                   | COMMIT_INDEX | LAST_LOG_TERM | LAST_LOG_INDEX | ROLE     | VOTED_FOR | LAST_APPLY_INDEX | SERVER_READY_FOR_RW | INSTANCE_TYPE |
    +-----------+--------------+------------------------------------------------------------------+--------------+---------------+----------------+----------+-----------+------------------+---------------------+---------------+
    |         2 |           27 | cluster-wesql-3replica-mysql-2.cluster-wesql-3replica-mysql-head |           13 |            27 |             13 | Follower |         3 |               12 | No                  | Normal        |
    +-----------+--------------+------------------------------------------------------------------+--------------+---------------+----------------+----------+-----------+------------------+---------------------+---------------+
    1 row in set (0.00 sec)

    kubectl exec -it cluster-wesql-3replica-mysql-2 -c mysql bash
    kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
    [root@cluster-wesql-3replica-mysql-2 /]# mysql -p$MYSQL_ROOT_PASSWORD
    mysql: [Warning] Using a password on the command line interface can be insecure.
    ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
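The same login check can be repeated on every replica without opening an interactive shell, using the `kubectl exec [POD] -- [COMMAND]` form that the deprecation warning recommends. The sketch below assumes, as in the sessions above, that the container is named `mysql` and that `MYSQL_ROOT_PASSWORD` is set inside it; the helper name `check_mysql_login` is hypothetical.

```shell
# Try a root login inside one pod and report whether it succeeded.
# $1 = pod name. Assumes container "mysql" and $MYSQL_ROOT_PASSWORD inside it.
check_mysql_login() {
  pod="$1"
  # Single quotes keep $MYSQL_ROOT_PASSWORD unexpanded locally,
  # so it is resolved by the shell inside the container.
  if kubectl exec "$pod" -c mysql -- \
      sh -c 'mysql -uroot -p"$MYSQL_ROOT_PASSWORD" -e "select 1"' >/dev/null 2>&1; then
    echo "$pod: login ok"
  else
    echo "$pod: login failed"
  fi
}

# usage:
#   for i in 0 1 2; do check_mysql_login "cluster-wesql-3replica-mysql-$i"; done
```

A failed result here does not distinguish an access-denied error from an unreachable server; drop the output redirection to see the actual error.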

  5. See the logs of cluster-wesql-3replica-mysql-0

    kubectl logs cluster-wesql-3replica-mysql-0
    Defaulted container "mysql" out of: mysql, inject-mysql-exporter, kb-rolechangedcheck, config-manager-sidecar
    cluster-wesql-3replica-mysql-0
    no need to call add 0 cluster-wesql-3replica-mysql-0.cluster-wesql-3replica-mysql-headless:13306;cluster-wesql-3replica-mysql-1.cluster-wesql-3replica-mysql-headless:13306;cluster-wesql-3replica-mysql-2.cluster-wesql-3replica-mysql-headless:13306@1
    docker-entrypoint.sh mysqld --defaults-file=/opt/mysql/my.cnf --cluster-start-index=1 --cluster-info="cluster-wesql-3replica-mysql-0.cluster-wesql-3replica-mysql-headless:13306;cluster-wesql-3replica-mysql-1.cluster-wesql-3replica-mysql-headless:13306;cluster-wesql-3replica-mysql-2.cluster-wesql-3replica-mysql-headless:13306@1" --cluster-id=1
    2023-02-03 06:53:05+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server started.
    2023-02-03 06:53:06+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
    2023-02-03 06:53:06+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server started.
    '/var/lib/mysql/mysql.sock' -> '/var/run/mysqld/mysqld.sock'
    2023-02-03T06:53:07.865781Z 0 [Warning] [MY-011068] [Server] The syntax 'slave_exec_mode' is deprecated and will be removed in a future release. Please use replica_exec_mode instead.
    2023-02-03T06:53:07.888917Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.30) starting as process 7
    2023-02-03T06:53:07.916496Z 0 [Warning] [MY-000054] [Server] World-writable config file '/data/mysql/data/auto.cnf' is ignored.
    2023-02-03T06:53:07.916667Z 0 [Warning] [MY-010107] [Server] World-writable config file '/data/mysql/data/auto.cnf' has been removed.
    2023-02-03T06:53:07.918837Z 0 [Warning] [MY-010075] [Server] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 6a678a86-a38f-11ed-8ab2-0adc03c90a03.
    2023-02-03T06:53:07.938449Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
    2023-02-03T06:53:08.651383Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
    2023-02-03T06:53:09.028744Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
    2023-02-03T06:53:09.059563Z 0 [System] [MY-010232] [Server] XA crash recovery finished.
    2023-02-03T06:53:09.063238Z 0 [Warning] [MY-000000] [Server] Recover consensus index is 12
    2023-02-03T06:53:09.175341Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
    2023-02-03T06:53:09.175462Z 0 [System] [MY-013602] [Server] Channel mysqlmain configured to support TLS. Encrypted connections are now supported for this channel.
    [2023-02-03 06:53:10.682121] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:10.682121] [Default] Server 1 : Paxos state change from FOLL to CAND !!
    [2023-02-03 06:53:10.682121] [Default] Server 1 : Epoch task currentEpoch(0) during requestVote
    [2023-02-03 06:53:10.682121] [Default] Server 1 : Start new requestVote: new term(6)
    [2023-02-03 06:53:12.136826] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:12.136826] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:12.136826] [Default] Server 1 : Epoch task currentEpoch(2) during requestVote
    [2023-02-03 06:53:12.136826] [Default] Server 1 : Start new requestVote: new term(7)
    [2023-02-03 06:53:13.581527] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:13.581527] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:13.581527] [Default] Server 1 : Epoch task currentEpoch(4) during requestVote
    [2023-02-03 06:53:13.581527] [Default] Server 1 : Start new requestVote: new term(8)
    [2023-02-03 06:53:15.054845] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:15.054845] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:15.054845] [Default] Server 1 : Epoch task currentEpoch(6) during requestVote
    [2023-02-03 06:53:15.054845] [Default] Server 1 : Start new requestVote: new term(9)
    [2023-02-03 06:53:16.639952] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:16.639952] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:16.639952] [Default] Server 1 : Epoch task currentEpoch(8) during requestVote
    [2023-02-03 06:53:16.639952] [Default] Server 1 : Start new requestVote: new term(10)
    [2023-02-03 06:53:18.144958] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:18.144958] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:18.144958] [Default] Server 1 : Epoch task currentEpoch(10) during requestVote
    [2023-02-03 06:53:18.144958] [Default] Server 1 : Start new requestVote: new term(11)
    [2023-02-03 06:53:19.569252] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:19.569252] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:19.569252] [Default] Server 1 : Epoch task currentEpoch(12) during requestVote
    [2023-02-03 06:53:19.569252] [Default] Server 1 : Start new requestVote: new term(12)
    [2023-02-03 06:53:21.038265] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:21.038265] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:21.038265] [Default] Server 1 : Epoch task currentEpoch(14) during requestVote
    [2023-02-03 06:53:21.038265] [Default] Server 1 : Start new requestVote: new term(13)
    [2023-02-03 06:53:22.517921] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:22.517921] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:22.517921] [Default] Server 1 : Epoch task currentEpoch(16) during requestVote
    [2023-02-03 06:53:22.517921] [Default] Server 1 : Start new requestVote: new term(14)
    [2023-02-03 06:53:24.070103] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:24.070103] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:24.070103] [Default] Server 1 : Epoch task currentEpoch(18) during requestVote
    [2023-02-03 06:53:24.070103] [Default] Server 1 : Start new requestVote: new term(15)
    [2023-02-03 06:53:25.549948] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:25.549948] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:25.549948] [Default] Server 1 : Epoch task currentEpoch(20) during requestVote
    [2023-02-03 06:53:25.549948] [Default] Server 1 : Start new requestVote: new term(16)
    [2023-02-03 06:53:27.076833] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:27.076833] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:27.076833] [Default] Server 1 : Epoch task currentEpoch(22) during requestVote
    [2023-02-03 06:53:27.076833] [Default] Server 1 : Start new requestVote: new term(17)
    [2023-02-03 06:53:28.536479] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:28.536479] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:28.536479] [Default] Server 1 : Epoch task currentEpoch(24) during requestVote
    [2023-02-03 06:53:28.536479] [Default] Server 1 : Start new requestVote: new term(18)
    [2023-02-03 06:53:30.030237] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:30.030237] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:30.030237] [Default] Server 1 : Epoch task currentEpoch(26) during requestVote
    [2023-02-03 06:53:30.030237] [Default] Server 1 : Start new requestVote: new term(19)
    [2023-02-03 06:53:31.487485] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:31.487485] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:31.487485] [Default] Server 1 : Epoch task currentEpoch(28) during requestVote
    [2023-02-03 06:53:31.487485] [Default] Server 1 : Start new requestVote: new term(20)
    [2023-02-03 06:53:33.003761] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:33.003761] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:33.003761] [Default] Server 1 : Epoch task currentEpoch(30) during requestVote
    [2023-02-03 06:53:33.003761] [Default] Server 1 : Start new requestVote: new term(21)
    [2023-02-03 06:53:34.550344] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:34.550344] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:34.550344] [Default] Server 1 : Epoch task currentEpoch(32) during requestVote
    [2023-02-03 06:53:34.550344] [Default] Server 1 : Start new requestVote: new term(22)
    [2023-02-03 06:53:36.076752] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:36.076752] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:36.076752] [Default] Server 1 : Epoch task currentEpoch(34) during requestVote
    [2023-02-03 06:53:36.076752] [Default] Server 1 : Start new requestVote: new term(23)
    [2023-02-03 06:53:37.546530] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:37.546530] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:37.546530] [Default] Server 1 : Epoch task currentEpoch(36) during requestVote
    [2023-02-03 06:53:37.546530] [Default] Server 1 : Start new requestVote: new term(24)
    [2023-02-03 06:53:39.054754] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:39.054754] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:39.054754] [Default] Server 1 : Epoch task currentEpoch(38) during requestVote
    [2023-02-03 06:53:39.054754] [Default] Server 1 : Start new requestVote: new term(25)
    [2023-02-03 06:53:40.577980] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:40.577980] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:40.577980] [Default] Server 1 : Epoch task currentEpoch(40) during requestVote
    [2023-02-03 06:53:40.577980] [Default] Server 1 : Start new requestVote: new term(26)
    [2023-02-03 06:53:40.587714] [Default] EasyNet::onConnected server 2
    [2023-02-03 06:53:40.588299] [Default] EasyNet::onConnected server 3
    [2023-02-03 06:53:42.014905] [Default] Server 1 : Enter startElectionCallback
    [2023-02-03 06:53:42.014905] [Default] Server 1 : Paxos state change from CAND to CAND !!
    [2023-02-03 06:53:42.014905] [Default] Server 1 : Epoch task currentEpoch(42) during requestVote
    [2023-02-03 06:53:42.014905] [Default] Server 1 : Start new requestVote: new term(27)
    [2023-02-03 06:53:42.020134] [Default] Server 1 : leaderStickiness check: msg::force(0) state:1 electionTimer::Stage:0 leaderId:0 .
    [2023-02-03 06:53:42.020134] [Default] Server 1 : isVote: 0, local(lli:12, llt:5); msg(candidateid: 3, term: 27 lli:12, llt:5) .
    [2023-02-03 06:53:42.023056] [Default] Server 1 : server 3 refuse to let me became leader.
    [2023-02-03 06:53:42.026537] [Default] Server 1 : server 2 refuse to let me became leader.
    [2023-02-03 06:53:42.027838] [Default] Server 1 : Paxos state change from CAND to FOLL !!
    2023-02-03T06:53:42.257106Z 5 [System] [MY-025002] [Server] Consensus apply thread start, recover status = 0, consensus start apply index = 0, rli consensus index = 12.
    2023-02-03T06:53:42.257900Z 5 [System] [MY-025003] [Server] Consensus apply thread group relay log file name = './mysql-bin.000001', pos = 3609, rli apply index = 12.
    2023-02-03T06:53:42.257923Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.30' socket: '/var/run/mysqld/mysqld.sock' port: 3306 WeSQL Server - GPL, Release 5, Revision d6b8719.
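Most of the log above is repeated election attempts, so a short summary makes it easier to compare pods. The helper below is a sketch only: it keys on the `Start new requestVote: new term(N)` message exactly as it appears in this log, and the function name `election_summary` is made up for illustration.

```shell
# Summarize election attempts found in a pod log read from stdin.
# Matches the "Start new requestVote: new term(N)" lines seen in this report.
election_summary() {
  # Extract just the term numbers, one per line.
  terms=$(grep -o 'Start new requestVote: new term([0-9]*)' | sed 's/.*(\([0-9]*\))/\1/')
  if [ -z "$terms" ]; then
    echo "no election rounds found"
    return 0
  fi
  rounds=$(printf '%s\n' "$terms" | wc -l | tr -d ' ')
  first=$(printf '%s\n' "$terms" | head -n 1)
  last=$(printf '%s\n' "$terms" | tail -n 1)
  echo "rounds=$rounds first_term=$first last_term=$last"
}

# usage: kubectl logs cluster-wesql-3replica-mysql-0 -c mysql | election_summary
```

In the log captured in this report, the request-vote lines run from term 6 to term 27 before the server falls back to follower.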

ahjing99 commented 8 months ago

Closing because this issue is out of date. Will reopen if it fails again.