Closed ahjing99 closed 5 months ago
vscale with offline instances works well in the InstanceSet Controller. There are two issues that need to be investigated/discussed further:
1. asmysql-nhgrgg-mysql-2 turned to Error after the vscale succeeded. @xuriwuyun
2. The OpsRequest turned to Failed under this circumstance. @wangyelei
➜ ~ kbcli version
Kubernetes: v1.29.4-gke.1043002
KubeBlocks: 0.9.0-beta.37
kbcli: 0.9.0-beta.27
https://github.com/apecloud/kubeblocks/actions/runs/9644086659/job/26596632960
1. Create the cluster and run onlineInstancesToOffline
➜ ~ k get pod
NAME                     READY   STATUS        RESTARTS      AGE
asmysql-nhgrgg-mysql-1   3/4     Terminating   0             7m47s
asmysql-nhgrgg-mysql-2   3/4     Error         1 (15s ago)   40s
asmysql-nhgrgg-mysql-3   4/4     Running       0             102s
➜ ~ k logs asmysql-nhgrgg-mysql-2 --previous
Defaulted container "mysql" out of: mysql, metrics, lorry, config-manager, init-data (init), init-syncer (init), init-xtrabackup (init)
2024-06-25T10:38:53Z INFO Initialize DB manager
2024-06-25T10:38:53Z INFO KB_WORKLOAD_TYPE ENV not set
2024-06-25T10:38:53Z INFO HTTPServer Starting HTTP Server
2024-06-25T10:38:53Z INFO HTTPServer API route path {"method": "POST", "path": ["/v1.0/rebuild", "/v1.0/start", "/v1.0/stop", "/v1.0/switchover"]}
2024-06-25T10:38:53Z INFO HTTPServer API route path {"method": "GET", "path": ["/v1.0/datasync"]}
2024-06-25T10:38:53Z INFO HTTPServer http server {"listen address": "0.0.0.0", "port": 3601}
2024-06-25T10:38:53Z INFO HA HA starting
2024-06-25T10:38:53Z INFO pinger Waiting for dns resolution to be ready
2024-06-25T10:38:53Z INFO pinger dns resolution is ready {"dns": "asmysql-nhgrgg-mysql-2.asmysql-nhgrgg-mysql-headless.default.svc"}
2024-06-25T10:38:53Z INFO DCS-K8S pod selector: app.kubernetes.io/instance=asmysql-nhgrgg,app.kubernetes.io/managed-by=kubeblocks,apps.kubeblocks.io/component-name=mysql
2024-06-25T10:38:53Z INFO DCS-K8S podlist: 3
2024-06-25T10:38:53Z DEBUG HA cluster info {"cluster": 
{"ClusterCompName":"asmysql-nhgrgg-mysql","Namespace":"default","Replicas":3,"HaConfig":{"ClusterInitializeOwner":"asmysql-nhgrgg-mysql-1","DeleteMembers":{"asmysql-nhgrgg-mysql-0":{"UID":"67ac2a57-b8ab-412d-bf14-566ae488ce89","IsFinished":false}}},"Leader":{"DBState":{"OpTimestamp":1719311869,"Extra":{"Binlog_File":"asmysql-nhgrgg-mysql-1-bin.000005","Binlog_Pos":"","gtid_executed":"25b31826-32de-11ef-9c92-72e087663138:1-87","gtid_purged":"","hostname":"asmysql-nhgrgg-mysql-1","read_only":"0","server_uuid":"25b31826-32de-11ef-9c92-72e087663138","super_read_only":"0"}},"Index":"415570","Name":"asmysql-nhgrgg-mysql-1","AcquireTime":1719311919,"RenewTime":1719311919,"TTL":15,"Resource":{"metadata":{"name":"asmysql-nhgrgg-mysql-leader","namespace":"default","uid":"f4d4d891-7694-46a8-a713-b733ea70fb4c","resourceVersion":"415570","creationTimestamp":"2024-06-25T10:32:49Z","labels":{"app.kubernetes.io/instance":"asmysql-nhgrgg","app.kubernetes.io/managed-by":"kubeblocks","apps.kubeblocks.io/component-name":"mysql"},"annotations":{"acquire-time":"1719311919","dbstate":"{\"OpTimestamp\":1719311869,\"Extra\":{\"Binlog_File\":\"asmysql-nhgrgg-mysql-1-bin.000005\",\"Binlog_Pos\":\"\",\"gtid_executed\":\"25b31826-32de-11ef-9c92-72e087663138:1-87\",\"gtid_purged\":\"\",\"hostname\":\"asmysql-nhgrgg-mysql-1\",\"read_only\":\"0\",\"server_uuid\":\"25b31826-32de-11ef-9c92-72e087663138\",\"super_read_only\":\"0\"}}","leader":"asmysql-nhgrgg-mysql-1","renew-time":"1719311919","ttl":"15"},"ownerReferences":[{"apiVersion":"apps.kubeblocks.io/v1alpha1","kind":"Cluster","name":"asmysql-nhgrgg","uid":"303192ba-388e-4c19-bf1b-df685f69f085"}],"managedFields":[{"manager":"syncer","operation":"Update","apiVersion":"v1","time":"2024-06-25T10:38:39Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:acquire-time":{},"f:dbstate":{},"f:leader":{},"f:renew-time":{},"f:ttl":{}},"f:labels":{".":{},"f:app.kubernetes.io/instance":{},"f:app.kubernetes.io/managed-by":{},"f:app
s.kubeblocks.io/component-name":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"303192ba-388e-4c19-bf1b-df685f69f085\"}":{}}}}}]}}},"Members":[{"Index":"","Name":"asmysql-nhgrgg-mysql-1","Role":"primary","PodIP":"10.92.3.211","DBPort":"3306","SyncerPort":"3601","UID":"b00b0a7b-b1e9-41e0-b487-a18331dfa325","ComponentName":"mysql","UseIP":false},{"Index":"","Name":"asmysql-nhgrgg-mysql-2","Role":"secondary","PodIP":"10.92.0.122","DBPort":"3306","SyncerPort":"3601","UID":"396afb35-e82f-4c2f-bd88-be5aa8151c41","ComponentName":"mysql","UseIP":false},{"Index":"","Name":"asmysql-nhgrgg-mysql-3","Role":"secondary","PodIP":"10.92.1.107","DBPort":"3306","SyncerPort":"3601","UID":"3c56b030-5f1c-49f2-8877-a98682674fb3","ComponentName":"mysql","UseIP":false}],"Switchover":null,"Extra":null,"Resource":{"metadata":{"name":"asmysql-nhgrgg","namespace":"default","uid":"303192ba-388e-4c19-bf1b-df685f69f085","resourceVersion":"414265","generation":4,"creationTimestamp":"2024-06-25T10:31:20Z","labels":{"clusterdefinition.kubeblocks.io/name":"mysql","clusterversion.kubeblocks.io/name":"mysql-8.0.33"},"annotations":{"kubeblocks.io/ops-request":"[{\"name\":\"asmysql-nhgrgg-verticalscaling-rr72s\",\"type\":\"VerticalScaling\"}]"},"finalizers":["cluster.kubeblocks.io/finalizer"],"managedFields":[{"manager":"kbcli","operation":"Update","apiVersion":"apps.kubeblocks.io/v1alpha1","time":"2024-06-25T10:31:20Z","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{".":{},"f:affinity":{".":{},"f:podAntiAffinity":{},"f:tenancy":{}},"f:clusterDefinitionRef":{},"f:clusterVersionRef":{},"f:resources":{".":{},"f:cpu":{},"f:memory":{}},"f:storage":{".":{},"f:size":{}},"f:terminationPolicy":{}}}},{"manager":"manager","operation":"Update","apiVersion":"apps.kubeblocks.io/v1alpha1","time":"2024-06-25T10:36:54Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:kubeblocks.io/ops-request":{}},"f:finalizers":{".":{},"v:\"cluster.kubeblocks.io/finalizer\"":{}},"f:labels":{".":{},"f:clusterdef
inition.kubeblocks.io/name":{},"f:clusterversion.kubeblocks.io/name":{}}},"f:spec":{"f:componentSpecs":{}}}},{"manager":"manager","operation":"Update","apiVersion":"apps.kubeblocks.io/v1alpha1","time":"2024-06-25T10:36:56Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{".":{},"f:clusterDefGeneration":{},"f:components":{".":{},"f:mysql":{".":{},"f:phase":{},"f:podsReady":{},"f:podsReadyTime":{}}},"f:conditions":{},"f:observedGeneration":{},"f:phase":{}}},"subresource":"status"}]},"spec":{"clusterDefinitionRef":"mysql","clusterVersionRef":"mysql-8.0.33","terminationPolicy":"Halt","componentSpecs":[{"name":"mysql","componentDefRef":"mysql","enabledLogs":["auditlog","error","slow"],"replicas":3,"resources":{"limits":{"cpu":"200m","memory":"644245094400m"},"requests":{"cpu":"200m","memory":"644245094400m"}},"volumeClaimTemplates":[{"name":"data","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"10Gi"}}}}],"switchPolicy":{"type":"Noop"},"serviceAccountName":"kb-asmysql-nhgrgg"}],"affinity":{"podAntiAffinity":"Preferred","tenancy":"SharedNode"},"resources":{"cpu":"0","memory":"0"},"storage":{"size":"0"},"monitor":{}},"status":{"observedGeneration":4,"phase":"Updating","components":{"mysql":{"phase":"Updating","podsReady":false,"podsReadyTime":"2024-06-25T10:36:03Z"}},"clusterDefGeneration":2,"conditions":[{"type":"ProvisioningStarted","status":"True","observedGeneration":4,"lastTransitionTime":"2024-06-25T10:31:20Z","reason":"PreCheckSucceed","message":"The operator has started the provisioning of Cluster: asmysql-nhgrgg"},{"type":"ApplyResources","status":"True","observedGeneration":4,"lastTransitionTime":"2024-06-25T10:31:20Z","reason":"ApplyResourcesSucceed","message":"Successfully applied for resources"},{"type":"ReplicasReady","status":"False","lastTransitionTime":"2024-06-25T10:36:56Z","reason":"ReplicasNotReady","message":"pods are not ready in Components: [mysql], refer to related component message in 
Cluster.status.components"},{"type":"Ready","status":"False","lastTransitionTime":"2024-06-25T10:36:56Z","reason":"ComponentsNotReady","message":"pods are unavailable in Components: [mysql], refer to related component message in Cluster.status.components"}]}}}}
2024-06-25T10:38:53Z INFO HA The cluster identifier is initialized. {"Cluster Initialize Owner": "asmysql-nhgrgg-mysql-1"}
2024-06-25T10:38:53Z INFO Hypervisor Starting Hypervisor
2024-06-25T10:38:53Z INFO Hypervisor Start DB Service {"command": "/usr/bin/bash -c mv /var/lib/mysql/plugin/audit_log.so /usr/lib64/mysql/plugin/\nrm -rf /var/lib/mysql/plugin\nchown -R mysql:root /var/lib/mysql\nskip_replica_start=\"OFF\"\nif [ -f /var/lib/mysql/data/.restore_new_cluster ]; then\n skip_replica_start=\"ON\"\nfi\ndocker-entrypoint.sh mysqld --server-id $(( ${KB_POD_NAME##*-} + 1)) \\n--plugin-load-add=rpl_semi_sync_source=semisync_source.so \\n--plugin-load-add=rpl_semi_sync_replica=semisync_replica.so \\n--plugin-load-add=audit_log=audit_log.so \\n--log-bin=/var/lib/mysql/binlog/asmysql-nhgrgg-mysql-2-bin \\n--skip-replica-start=$skip_replica_start\n"}
2024-06-25T10:38:53Z INFO Hypervisor Starting watcher on dbService
2024-06-25T10:38:53Z INFO MySQL DB is not ready {"error": "dial tcp 127.0.0.1:3306: connect: connection refused"}
== DB ERR == mv: cannot stat '/var/lib/mysql/plugin/audit_log.so': No such file or directory
== DB == 2024-06-25 10:38:53+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.33-1.el8 started.
== DB == 2024-06-25 10:38:55+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
== DB == 2024-06-25 10:38:55+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.33-1.el8 started.
== DB == '/var/lib/mysql/mysql.sock' -> '/var/run/mysqld/mysqld.sock'
2024-06-25T10:38:58Z INFO MySQL wait for db service to be ready
2024-06-25T10:38:58Z INFO MySQL DB is not ready {"error": "dial tcp 127.0.0.1:3306: connect: connection refused"}
2024-06-25T10:39:03Z INFO MySQL wait for db service to be ready
2024-06-25T10:39:03Z INFO MySQL Set semi-sync source timeout {"sql": "SET GLOBAL rpl_semi_sync_source_timeout = 4294967295;", "leader": "asmysql-nhgrgg-mysql-1"}
panic: start DB service failed: SET GLOBAL rpl_semi_sync_source_timeout = 4294967295; execute failed: dial tcp 10.92.3.211:3306: connect: connection refused

goroutine 68 [running]:
github.com/apecloud/syncer/highavailability.(*Ha).Start(0xc00034cba0)
	/src/highavailability/ha.go:262 +0xadb
created by main.main in goroutine 1
	/src/cmd/syncer/main.go:111 +0x4ca
asmysql-nhgrgg-mysql-2 crashed, which caused the ops to fail:
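To make the crash easier to read, the refused address in the panic line can be matched against the Members list from the cluster info dump and the pod states from `k get pod`. The following is only a diagnostic sketch over values copied from this report; the helper names `refused_target` and `member_by_ip` are hypothetical and are not part of syncer or kbcli.

```python
import re

# Values copied from the "cluster info" dump and the `k get pod` output in this report.
members = [
    {"Name": "asmysql-nhgrgg-mysql-1", "Role": "primary", "PodIP": "10.92.3.211"},
    {"Name": "asmysql-nhgrgg-mysql-2", "Role": "secondary", "PodIP": "10.92.0.122"},
    {"Name": "asmysql-nhgrgg-mysql-3", "Role": "secondary", "PodIP": "10.92.1.107"},
]
pod_status = {
    "asmysql-nhgrgg-mysql-1": "Terminating",
    "asmysql-nhgrgg-mysql-2": "Error",
    "asmysql-nhgrgg-mysql-3": "Running",
}
panic_line = (
    "panic: start DB service failed: SET GLOBAL rpl_semi_sync_source_timeout = 4294967295; "
    "execute failed: dial tcp 10.92.3.211:3306: connect: connection refused"
)


def refused_target(line):
    """Return (ip, port) of the refused TCP dial in a log line, or None."""
    m = re.search(r"dial tcp (\d+\.\d+\.\d+\.\d+):(\d+): connect: connection refused", line)
    return (m.group(1), int(m.group(2))) if m else None


def member_by_ip(member_list, ip):
    """Find the cluster member that owns the given pod IP, or None."""
    return next((m for m in member_list if m["PodIP"] == ip), None)


ip, port = refused_target(panic_line)
target = member_by_ip(members, ip)
summary = (target["Name"], target["Role"], pod_status[target["Name"]])
print(summary)
```

With the values above, the refused dial resolves to asmysql-nhgrgg-mysql-1, which the dump lists as primary and which `k get pod` shows as Terminating at that moment.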
➜ ~ kbcli cluster describe-ops asmysql-nhgrgg-verticalscaling-rr72s -n default
Spec:
  Name: asmysql-nhgrgg-verticalscaling-rr72s   NameSpace: default   Cluster: asmysql-nhgrgg   Type: VerticalScaling
Command: kbcli cluster vscale asmysql-nhgrgg --components=mysql --cpu=200m --memory=644245094400m --namespace=default
Last Configuration:
COMPONENT   REQUEST-CPU   REQUEST-MEMORY   LIMIT-CPU   LIMIT-MEMORY
mysql       100m          512Mi            100m        512Mi
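The vscale command requests `--memory=644245094400m`, where the trailing `m` is the Kubernetes milli suffix, while the previous configuration is given as `512Mi`. A small conversion makes the two comparable. This is a minimal sketch that handles only the suffixes appearing here plus a few binary ones; `quantity_to_bytes` and `to_mib` are hypothetical helpers, not the Kubernetes quantity parser.

```python
from fractions import Fraction

# Binary suffixes handled by this sketch; the real Kubernetes parser supports more.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity: str) -> Fraction:
    """Convert a quantity string such as '512Mi' or '644245094400m' to bytes."""
    if quantity.endswith("m"):
        # Milli suffix: the number counts thousandths of a byte.
        return Fraction(int(quantity[:-1]), 1000)
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Fraction(quantity[: -len(suffix)]) * factor
    return Fraction(quantity)  # plain number of bytes


def to_mib(quantity: str) -> Fraction:
    """Express a quantity in MiB as an exact fraction."""
    return quantity_to_bytes(quantity) / 2**20


new_mib = to_mib("644245094400m")
old_mib = to_mib("512Mi")
print(float(new_mib), float(old_mib))  # 614.4 512.0
```

So the requested memory is 614.4 MiB (0.6 GiB), up from 512 MiB.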
Status:
  Start Time: Jun 25,2024 18:36 UTC+0800
  Completion Time: Jun 25,2024 18:39 UTC+0800
  Duration: 2m25s
  Status: Failed
  Progress: 2/3
OBJECT-KEY                   STATUS       DURATION   MESSAGE
Pod/asmysql-nhgrgg-mysql-2   Failed       53s        Failed to vertical scale: Pod/asmysql-nhgrgg-mysql-2 in Component: mysql, message:
Pod/asmysql-nhgrgg-mysql-1   Processing   2m6s       Start to vertical scale: Pod/asmysql-nhgrgg-mysql-1 in Component: mysql
Pod/asmysql-nhgrgg-mysql-3   Succeed      63s        Successfully vertical scale: Pod/asmysql-nhgrgg-mysql-3 in Component: mysql
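One reading that is consistent with `Progress: 2/3` is that every object in a finished state counts toward progress, whether it ended as Succeed or Failed, while Processing does not. The sketch below only illustrates that reading on the three rows above; the `TERMINAL` set and the `progress` helper are assumptions, not the confirmed KubeBlocks logic.

```python
# Assumption for illustration: Succeed and Failed both count as finished.
TERMINAL = {"Succeed", "Failed"}

# Rows copied from the describe-ops output above.
progress_details = [
    ("Pod/asmysql-nhgrgg-mysql-2", "Failed"),
    ("Pod/asmysql-nhgrgg-mysql-1", "Processing"),
    ("Pod/asmysql-nhgrgg-mysql-3", "Succeed"),
]


def progress(details):
    """Return 'finished/total' where finished counts objects in a terminal status."""
    finished = sum(1 for _, status in details if status in TERMINAL)
    return f"{finished}/{len(details)}"


print(progress(progress_details))  # 2/3
```

Under this reading the OpsRequest reached Failed while asmysql-nhgrgg-mysql-1 was still Processing, which matches the output above.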
Conditions:
LAST-TRANSITION-TIME         TYPE                 REASON                     STATUS   MESSAGE
Jun 25,2024 18:36 UTC+0800   WaitForProgressing   WaitForProgressing         True     wait for the controller to process the OpsRequest: asmysql-nhgrgg-verticalscaling-rr72s in Cluster: asmysql-nhgrgg
Jun 25,2024 18:36 UTC+0800   Validated            ValidateOpsRequestPassed   True     OpsRequest: asmysql-nhgrgg-verticalscaling-rr72s is validated
Jun 25,2024 18:36 UTC+0800   VerticalScaling      VerticalScalingStarted     True     Start to vertical scale resources in Cluster: asmysql-nhgrgg
Jun 25,2024 18:39 UTC+0800   Failed               OpsRequestFailed           False    Failed to process OpsRequest: asmysql-nhgrgg-verticalscaling-rr72s in cluster: asmysql-nhgrgg, more detailed informations in status.components
However, the cluster is Running:
➜ ~ kbcli cluster describe asmysql-nhgrgg
Name: asmysql-nhgrgg   Created Time: Jun 25,2024 18:31 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION        STATUS    TERMINATION-POLICY
default     mysql                mysql-8.0.33   Running   Halt
Endpoints:
COMPONENT   MODE        INTERNAL                                              EXTERNAL
mysql       ReadWrite   asmysql-nhgrgg-mysql.default.svc.cluster.local:3306
Topology:
COMPONENT   INSTANCE                 ROLE        STATUS    AZ              NODE                                                CREATED-TIME
mysql       asmysql-nhgrgg-mysql-1   secondary   Running   us-central1-c   gke-yjtest-default-pool-df8d642f-q6jg/10.128.0.25   Jun 25,2024 18:39 UTC+0800
mysql       asmysql-nhgrgg-mysql-2   secondary   Running   us-central1-c   gke-yjtest-default-pool-df8d642f-6hdv/10.128.0.28   Jun 25,2024 18:38 UTC+0800
mysql       asmysql-nhgrgg-mysql-3   primary     Running   us-central1-c   gke-yjtest-default-pool-df8d642f-m01r/10.128.0.30   Jun 25,2024 18:37 UTC+0800
Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)           STORAGE-SIZE   STORAGE-CLASS
mysql       false       200m / 200m          644245094400m / 644245094400m   data:10Gi      kb-default-sc
Images:
COMPONENT   TYPE    IMAGE
mysql       mysql   docker.io/apecloud/mysql:8.0.33
Data Protection: BACKUP-REPO AUTO-BACKUP BACKUP-SCHEDULE BACKUP-METHOD BACKUP-RETENTION RECOVERABLE-TIME
Show cluster events: kbcli cluster list-events -n default asmysql-nhgrgg
➜ ~ kbcli cluster volume-expand asmysql-nhgrgg --auto-approve --force=true --components mysql --volume-claim-templates data --storage 11Gi
OpsRequest asmysql-nhgrgg-volumeexpansion-k7sm7 created successfully, you can view the progress:
	kbcli cluster describe-ops asmysql-nhgrgg-volumeexpansion-k7sm7 -n default
➜ ~ k describe ops asmysql-nhgrgg-volumeexpansion-k7sm7
Name: asmysql-nhgrgg-volumeexpansion-k7sm7
Namespace: default
Labels: app.kubernetes.io/instance=asmysql-nhgrgg
        app.kubernetes.io/managed-by=kubeblocks
        ops.kubeblocks.io/ops-type=VolumeExpansion
Annotations:
API Version: apps.kubeblocks.io/v1alpha1
Kind: OpsRequest
Metadata:
Creation Timestamp: 2024-06-25T10:42:15Z
Finalizers:
opsrequest.kubeblocks.io/finalizer
Generate Name: asmysql-nhgrgg-volumeexpansion-
Generation: 2
Managed Fields:
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:generateName:
f:labels:
.:
f:app.kubernetes.io/instance:
f:app.kubernetes.io/managed-by:
f:spec:
.:
f:clusterName:
f:force:
f:preConditionDeadlineSeconds:
f:type:
f:volumeExpansion:
.:
k:{"componentName":"mysql"}:
.:
f:componentName:
f:volumeClaimTemplates:
.:
k:{"name":"data"}:
.:
f:name:
f:storage:
Manager: kbcli
Operation: Update
Time: 2024-06-25T10:42:15Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"opsrequest.kubeblocks.io/finalizer":
f:labels:
f:ops.kubeblocks.io/ops-type:
f:ownerReferences:
.:
k:{"uid":"303192ba-388e-4c19-bf1b-df685f69f085"}:
Manager: manager
Operation: Update
Time: 2024-06-25T10:42:15Z
API Version: apps.kubeblocks.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:clusterGeneration:
f:components:
.:
f:mysql:
.:
f:progressDetails:
f:conditions:
.:
k:{"type":"Validated"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
k:{"type":"VolumeExpanding"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
k:{"type":"WaitForProgressing"}:
.:
f:lastTransitionTime:
f:message:
f:reason:
f:status:
f:type:
f:lastConfiguration:
.:
f:components:
.:
f:mysql:
f:phase:
f:progress:
f:startTimestamp:
Manager: manager
Operation: Update
Subresource: status
Time: 2024-06-25T10:43:33Z
Owner References:
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Name: asmysql-nhgrgg
UID: 303192ba-388e-4c19-bf1b-df685f69f085
Resource Version: 419290
UID: 4d462ff4-e627-4460-aa5e-6cf44d28e8b3
Spec:
  Cluster Name: asmysql-nhgrgg
  Force: true
  Pre Condition Deadline Seconds: 0
  Type: VolumeExpansion
  Volume Expansion:
    Component Name: mysql
    Volume Claim Templates:
      Name: data
      Storage: 11Gi
Status:
  Cluster Generation: 5
  Components:
    Mysql:
      Progress Details:
        Group: data
        Message: Successfully expand volume: PVC/data-asmysql-nhgrgg-mysql-2 in component: mysql
        Object Key: PVC/data-asmysql-nhgrgg-mysql-2
        Status: Succeed
        Group: data
        Message: Successfully expand volume: PVC/data-asmysql-nhgrgg-mysql-1 in component: mysql
        Object Key: PVC/data-asmysql-nhgrgg-mysql-1
        Status: Succeed
  Conditions:
    Last Transition Time: 2024-06-25T10:42:15Z
    Message: wait for the controller to process the OpsRequest: asmysql-nhgrgg-volumeexpansion-k7sm7 in Cluster: asmysql-nhgrgg
    Reason: WaitForProgressing
    Status: True
    Type: WaitForProgressing
    Last Transition Time: 2024-06-25T10:42:15Z
    Message: OpsRequest: asmysql-nhgrgg-volumeexpansion-k7sm7 is validated
    Reason: ValidateOpsRequestPassed
    Status: True
    Type: Validated
    Last Transition Time: 2024-06-25T10:42:15Z
    Message: Start to expand the volumes in Cluster: asmysql-nhgrgg
    Reason: VolumeExpansionStarted
    Status: True
    Type: VolumeExpanding
  Last Configuration:
    Components:
      Mysql:
  Phase: Running
  Progress: 2/3
  Start Timestamp: 2024-06-25T10:42:15Z
Events:
  Type    Reason                    Age                    From                    Message
  Normal  WaitForProgressing        4m45s                  ops-request-controller  wait for the controller to process the OpsRequest: asmysql-nhgrgg-volumeexpansion-k7sm7 in Cluster: asmysql-nhgrgg
  Normal  ValidateOpsRequestPassed  4m45s (x2 over 4m45s)  ops-request-controller  OpsRequest: asmysql-nhgrgg-volumeexpansion-k7sm7 is validated
  Normal  VolumeExpansionStarted    4m45s (x2 over 4m45s)  ops-request-controller  Start to expand the volumes in Cluster: asmysql-nhgrgg
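The volume expansion stays at `Progress: 2/3` with only two PVCs listed under Progress Details. Comparing the reported object keys with the PVCs expected for the three instances in the Topology output shows which one has not been reported. This is a rough sketch over values copied from this report; the `PVC/<template>-<instance>` pattern is inferred from the two object keys shown, and `expected_keys` and `unreported` are hypothetical helpers.

```python
# Instances from the Topology section and the volume claim template from the OpsRequest spec.
instances = [
    "asmysql-nhgrgg-mysql-1",
    "asmysql-nhgrgg-mysql-2",
    "asmysql-nhgrgg-mysql-3",
]
template = "data"

# Object keys and statuses copied from Status > Components > Mysql > Progress Details.
reported = {
    "PVC/data-asmysql-nhgrgg-mysql-2": "Succeed",
    "PVC/data-asmysql-nhgrgg-mysql-1": "Succeed",
}


def expected_keys(template_name, instance_names):
    """Build the PVC object keys expected for each instance (naming pattern is inferred)."""
    return [f"PVC/{template_name}-{name}" for name in instance_names]


def unreported(expected, reported_details):
    """Return the expected object keys that have no progress detail yet."""
    return [key for key in expected if key not in reported_details]


missing = unreported(expected_keys(template, instances), reported)
print(missing)  # ['PVC/data-asmysql-nhgrgg-mysql-3']
```

Here the PVC of asmysql-nhgrgg-mysql-3 is the one without a progress entry, which lines up with the OpsRequest remaining in phase Running.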