Open JashBook opened 5 days ago
The FE pod has a post-start hook script that sets the root account password. One SQL command in that script hangs:
mysql --connect-timeout=1 -h127.0.0.1 -uroot -P9030 -px xxxxxxxx -e show databases
Opening a new connection with the MySQL client also hangs.
The fe-1 pod is functioning normally; connecting with the MySQL client and running show frontends shows that both FEs are operating normally.
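Because the hook command above can block indefinitely, one possible mitigation is to bound the whole call with a wall-clock limit. The following is a minimal sketch, not the actual hook script: `run_bounded` is a hypothetical helper name, the 5-second limit is arbitrary, and it assumes the coreutils `timeout` binary is available in the FE image.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the real hook): run a command under a
# hard wall-clock limit so a hung client cannot block the caller forever.
run_bounded() {
  local limit="$1" rc=0
  shift
  # Send TERM after $limit seconds, then KILL 2 seconds later if the
  # process is still alive.
  timeout -k 2 "$limit" "$@" || rc=$?
  # timeout exits with 124 when the limit was hit (137 if KILL was needed).
  if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
    echo "timed out after ${limit}s: $1" >&2
  fi
  return "$rc"
}
```

In the hook this could be called as `run_bounded 5 mysql --connect-timeout=1 -h127.0.0.1 -uroot -P9030 -e 'show databases'`, assuming the password is picked up from the `MYSQL_PWD` variable that the pod spec already sets. This only keeps the hook from hanging; it does not explain why the FE stops answering.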
The log of fe-0: fe.log
The stack of fe-0: stack.log
The gc stat of fe-0:
The jvm flags of fe-0:
root@strsent-nerqht-fe-0:/opt/starrocks# jcmd 10 VM.flags
10:
-XX:-AlwaysTenure -XX:CICompilerCount=2 -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:-CMSParallelRemarkEnabled -XX:ConcGCThreads=1 -XX:G1ConcRefinementThreads=2 -XX:G1HeapRegionSize=2097152 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=33554432 -XX:MarkStackSize=4194304 -XX:MaxHeapSize=8589934592 -XX:MaxNewSize=5152702464 -XX:MaxTenuringThreshold=7 -XX:MinHeapDeltaBytes=2097152 -XX:-NeverTenure -XX:NonNMethodCodeHeapSize=5825164 -XX:NonProfiledCodeHeapSize=122916538 -XX:ProfiledCodeHeapSize=122916538 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:SoftRefLRUPolicyMSPerMB=0 -XX:SurvivorRatio=8 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
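The heap-related flags above are printed in bytes, which makes them hard to compare with the `--memory 2Gi` requested in the reproduction steps. The sketch below converts three of them to MiB; `to_mib` is a hypothetical helper, and the flag values are copied from the jcmd output above.

```shell
# Hypothetical helper: convert a byte count to whole MiB.
to_mib() { echo $(( $1 / 1024 / 1024 )); }

# Values copied from the VM.flags output of fe-0.
flags='-XX:InitialHeapSize=33554432 -XX:MaxHeapSize=8589934592 -XX:MaxNewSize=5152702464'

for f in $flags; do
  name=${f#-XX:}      # strip the -XX: prefix
  name=${name%%=*}    # keep the part before '='
  value=${f#*=}       # keep the part after '='
  echo "$name: $(to_mib "$value") MiB"
done
# InitialHeapSize: 32 MiB
# MaxHeapSize: 8192 MiB
# MaxNewSize: 4914 MiB
```

A 2Gi limit is 2048 MiB, so the configured maximum heap is larger than the container limit. Whether this is related to the hang is not established here.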
The fe.conf of fe-0:
To Reproduce
Steps to reproduce the behavior:
kbcli cluster hscale strsent-nerqht --auto-approve --force=true --components be --replicas 2 --namespace default
kbcli cluster vscale strsent-nerqht --auto-approve --force=true --components fe --cpu 1100m --memory 2Gi --namespace default
➜ ~ kubectl get pod -l app.kubernetes.io/instance=strsent-nerqht
NAME                  READY   STATUS            RESTARTS   AGE
strsent-nerqht-be-0   3/3     Running           0          23m
strsent-nerqht-be-1   3/3     Running           0          22m
strsent-nerqht-fe-0   0/3     PodInitializing   0          20m
strsent-nerqht-fe-1   3/3     Running           0          20m

➜ ~ kubectl get ops -l app.kubernetes.io/instance=strsent-nerqht
NAME                                   TYPE              CLUSTER          STATUS    PROGRESS   AGE
strsent-nerqht-verticalscaling-9vbgm   VerticalScaling   strsent-nerqht   Running   1/2        20m

➜ ~ kubectl get cluster strsent-nerqht
NAME             CLUSTER-DEFINITION   VERSION   TERMINATION-POLICY   STATUS     AGE
strsent-nerqht                                  Delete               Updating   43m
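To pick the stuck pod out of `kubectl get pod` output like the listing above, the READY column can be compared field by field. This is a small sketch; `not_ready_pods` is a hypothetical helper that only parses the default table output and does not talk to the cluster itself.

```shell
# Hypothetical helper: read `kubectl get pod` table output on stdin and
# print the name and status of every pod whose READY count is incomplete.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, r, "/")              # READY column, e.g. "0/3"
    if (r[1] != r[2]) print $1, $3 # NAME and STATUS
  }'
}
```

For example, `kubectl get pod -l app.kubernetes.io/instance=strsent-nerqht | not_ready_pods` would print `strsent-nerqht-fe-0 PodInitializing` for the state shown above.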
kubectl describe cluster strsent-nerqht
Name: strsent-nerqht
Namespace: default
Labels: app.kubernetes.io/instance=strsent-nerqht
Annotations: kubeblocks.io/ops-request: [{"name":"strsent-nerqht-verticalscaling-9vbgm","type":"VerticalScaling"}]
  kubeblocks.io/reconcile: 2024-06-28T01:35:33.833177453Z
API Version: apps.kubeblocks.io/v1alpha1
Kind: Cluster
Metadata:
  Creation Timestamp: 2024-06-28T01:34:19Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generation: 12
  Managed Fields:
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:terminationPolicy:
    Manager: kubectl-client-side-apply
    Operation: Update
    Time: 2024-06-28T01:34:19Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:app.kubernetes.io/instance:
    Manager: kbcli
    Operation: Update
    Time: 2024-06-28T01:36:43Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:kubeblocks.io/ops-request:
          f:kubeblocks.io/reconcile:
        f:finalizers:
          .:
          v:"cluster.kubeblocks.io/finalizer":
      f:spec:
        f:componentSpecs:
        f:resources:
          .:
          f:cpu:
          f:memory:
        f:services:
        f:storage:
          .:
          f:size:
    Manager: manager
    Operation: Update
    Time: 2024-06-28T01:56:12Z
    API Version: apps.kubeblocks.io/v1alpha1
    Fields Type: FieldsV1
    fieldsV1:
      f:status:
        .:
        f:components:
          .:
          f:be:
            .:
            f:phase:
            f:podsReady:
            f:podsReadyTime:
          f:fe:
            .:
            f:phase:
            f:podsReady:
            f:podsReadyTime:
        f:conditions:
        f:observedGeneration:
        f:phase:
    Manager: manager
    Operation: Update
    Subresource: status
    Time: 2024-06-28T01:56:14Z
  Resource Version: 410692079
  UID: 9b4a3e93-5c2c-4d6a-8240-f6d742bd3e4c
Spec:
  Component Specs:
    Component Def: starrocks-be
    Name: be
    Replicas: 2
    Resources:
      Limits:
        Cpu: 1100m
        Memory: 2Gi
      Requests:
        Cpu: 1100m
        Memory: 2Gi
    Service Version: 3.2.2
    Volume Claim Templates:
      Name: data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage: 20Gi
    Component Def: starrocks-fe-sn
    Name: fe
    Replicas: 2
    Resources:
      Limits:
        Cpu: 1100m
        Memory: 2Gi
      Requests:
        Cpu: 1100m
        Memory: 2Gi
    Service Version: 3.2.2
    Volume Claim Templates:
      Name: data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage: 24Gi
  Resources:
    Cpu: 0
    Memory: 0
  Services:
    Annotations:
      networking.gke.io/load-balancer-type: Internal
    Component Selector: fe
    Name: fe-vpc
    Service Name: fe-vpc
    Spec:
      Ports:
        Name: fe-http
        Node Port: 30243
        Port: 8030
        Protocol: TCP
        Target Port: http-port
        Name: fe-mysql
        Node Port: 30369
        Port: 9030
        Protocol: TCP
        Target Port: query-port
      Type: LoadBalancer
  Storage:
    Size: 0
  Termination Policy: Delete
Status:
  Components:
    Be:
      Phase: Running
      Pods Ready: true
      Pods Ready Time: 2024-06-28T01:56:14Z
    Fe:
      Phase: Updating
      Pods Ready: false
      Pods Ready Time: 2024-06-28T01:54:53Z
  Conditions:
    Last Transition Time: 2024-06-28T01:34:19Z
    Message: The operator has started the provisioning of Cluster: strsent-nerqht
    Observed Generation: 12
    Reason: PreCheckSucceed
    Status: True
    Type: ProvisioningStarted
    Last Transition Time: 2024-06-28T01:38:05Z
    Message: Successfully applied for resources
    Observed Generation: 12
    Reason: ApplyResourcesSucceed
    Status: True
    Type: ApplyResources
    Last Transition Time: 2024-06-28T01:56:13Z
    Message: pods are not ready in Components: [fe], refer to related component message in Cluster.status.components
    Reason: ReplicasNotReady
    Status: False
    Type: ReplicasReady
    Last Transition Time: 2024-06-28T01:56:13Z
    Message: pods are unavailable in Components: [fe], refer to related component message in Cluster.status.components
    Reason: ComponentsNotReady
    Status: False
    Type: Ready
  Observed Generation: 12
  Phase: Updating
Events:
  Type  Reason  Age  From  Message
  Normal  ComponentPhaseTransition  47m (x2 over 47m)  cluster-controller  component is Creating
  Warning  Unhealthy  45m (x5 over 46m)  event-controller  Pod strsent-nerqht-be-0: Startup probe failed: Get "http://10.128.2.63:8040/api/health": dial tcp 10.128.2.63:8040: connect: connection refused
  Normal  AllReplicasReady  45m  cluster-controller  all pods of components are ready, waiting for the probe detection successful
  Normal  ClusterReady  45m  cluster-controller  Cluster: strsent-nerqht is ready, current phase is Running
  Normal  Running  45m  cluster-controller  Cluster: strsent-nerqht is ready, current phase is Running
  Warning  ComponentsNotReady  43m (x2 over 45m)  cluster-controller  pods are unavailable in Components: [be], refer to related component message in Cluster.status.components
  Warning  ReplicasNotReady  43m (x2 over 45m)  cluster-controller  pods are not ready in Components: [be], refer to related component message in Cluster.status.components
  Normal  ApplyResourcesSucceed  43m (x6 over 47m)  cluster-controller  Successfully applied for resources
  Warning  ReplicasNotReady  43m  cluster-controller  pods are not ready in Components: [be fe], refer to related component message in Cluster.status.components
  Normal  ComponentPhaseTransition  43m (x2 over 43m)  cluster-controller  component is Updating
  Normal  HorizontalScale  41m (x2 over 41m)  component-controller  start horizontal scale component fe of cluster strsent-nerqht from 1 to 2
  Normal  HorizontalScale  33m  component-controller  start horizontal scale component fe of cluster strsent-nerqht from 2 to 0
  Normal  HorizontalScale  33m  component-controller  start horizontal scale component be of cluster strsent-nerqht from 1 to 0
  Normal  HorizontalScale  32m  component-controller  start horizontal scale component fe of cluster strsent-nerqht from 0 to 2
  Normal  HorizontalScale  32m  component-controller  start horizontal scale component be of cluster strsent-nerqht from 0 to 1
  Normal  ComponentPhaseTransition  32m (x12 over 45m)  cluster-controller  component is Running
  Normal  PreCheckSucceed  26m (x11 over 47m)  cluster-controller  The operator has started the provisioning of Cluster: strsent-nerqht
  Normal  HorizontalScale  26m (x2 over 26m)  component-controller  start horizontal scale component be of cluster strsent-nerqht from 1 to 2
kubectl describe pod strsent-nerqht-fe-0
Name: strsent-nerqht-fe-0
Namespace: default
Priority: 0
Node: gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic/10.10.0.70
Start Time: Fri, 28 Jun 2024 09:56:57 +0800
Labels: app.kubernetes.io/component=starrocks-fe-sn
  app.kubernetes.io/instance=strsent-nerqht
  app.kubernetes.io/managed-by=kubeblocks
  app.kubernetes.io/name=starrocks-fe-sn
  app.kubernetes.io/version=starrocks-fe-sn
  apps.kubeblocks.io/cluster-uid=9b4a3e93-5c2c-4d6a-8240-f6d742bd3e4c
  apps.kubeblocks.io/component-name=fe
  apps.kubeblocks.io/pod-name=strsent-nerqht-fe-0
  componentdefinition.kubeblocks.io/name=starrocks-fe-sn
  controller-revision-hash=6bc67dbc6c
  workloads.kubeblocks.io/instance=strsent-nerqht-fe
  workloads.kubeblocks.io/managed-by=InstanceSet
Annotations: apps.kubeblocks.io/component-replicas: 2
  kubeblocks.io/restart: 2024-06-28T01:50:32Z
Status: Pending
IP: 10.128.2.114
IPs:
IP: 10.128.2.114
Controlled By: InstanceSet/strsent-nerqht-fe
Init Containers:
init-lorry:
Container ID: containerd://8826873260aa831e8f604d768454b96fb08f28bc16c85bff916afc13dd365130
Image: docker.io/apecloud/kubeblocks-tools:0.9.0-beta.39
Image ID: docker.io/apecloud/kubeblocks-tools@sha256:5c137c9ae94ef615be726bbd35df0a31217a3701b1c64e5773321b88e287afa8
Port:
Host Port:
Command:
cp
-r
/bin/lorry
/config
/kubeblocks/
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 28 Jun 2024 09:56:59 +0800
Finished: Fri, 28 Jun 2024 09:57:01 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Environment Variables from:
strsent-nerqht-fe-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-nerqht-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_POD_NAME: strsent-nerqht-fe-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-nerqht-fe-headless.$(KB_NAMESPACE).svc
Mounts:
/kubeblocks from kubeblocks (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jljd4 (ro)
starrocks-tools:
Container ID: containerd://5882df76793f3385db1ad3625455e5199cb8b92e6d128d75d0b1f5e68e938a7c
Image: docker.io/apecloud/starrocks-tools:3.2.2
Image ID: docker.io/apecloud/starrocks-tools@sha256:fd9b4e989932b172368cdd1de986845ea96c0d5c19efd4c7fe3bea11bd7aa0f5
Port:
Host Port:
Command:
cp
/bin/mysql
/kb_tools/mysql
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 28 Jun 2024 09:57:03 +0800
Finished: Fri, 28 Jun 2024 09:57:04 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Environment Variables from:
strsent-nerqht-fe-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-nerqht-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_POD_NAME: strsent-nerqht-fe-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-nerqht-fe-headless.$(KB_NAMESPACE).svc
TOOLS_SCRIPTS_PATH: /opt/kb-tools/reload/fe-cm
Mounts:
/kb_tools from kb-tools (rw)
/opt/config-manager from config-manager-config (rw)
/opt/kb-tools/reload/fe-cm from cm-script-fe-cm (rw)
/opt/starrocks/fe/conf from fe-cm (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jljd4 (ro)
Containers:
fe:
Container ID:
Image: docker.io/starrocks/fe-ubuntu:3.2.2
Image ID:
Ports: 8030/TCP, 9020/TCP, 9030/TCP, 9010/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command: bash -c /opt/starrocks/fe_entrypoint.sh ${FE_DISCOVERY_SERVICE_NAME}
Image: docker.io/starrocks/fe-ubuntu:3.2.2
Image ID:
Ports: 3501/TCP, 50001/TCP
Host Ports: 0/TCP, 0/TCP
Command: /kubeblocks/lorry --port 3501 --grpcport 50001 --config-path /kubeblocks/config/lorry/components/
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Startup: tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
strsent-nerqht-fe-env ConfigMap Optional: false
strsent-nerqht-fe-rsm-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-nerqht-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_POD_NAME: strsent-nerqht-fe-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-nerqht-fe-headless.$(KB_NAMESPACE).svc
KB_BUILTIN_HANDLER: custom
KB_SERVICE_USER: <set to the key 'username' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_SERVICE_PASSWORD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_SERVICE_PORT: 8030
KB_DATA_PATH: /opt/starrocks/fe/meta
KB_ACTION_COMMANDS: {"memberLeave":["/bin/bash","-c","#!/usr/bin/env bash\n\nset -x\nset -o errexit\n\nleader_host=\"\"\nleave_member_host=\"\"\nleave_member_port=\"\"\nhelper_endpoints=\"\"\ncandidate_names=\"\"\n\nfunction info() {\n echo \"[$(date +'%Y-%m-%d %H:%M:%S')] $*\"\n}\n\n# root@x-fe-0:/opt/starrocks# mysql -h 127.0.0.1 -P 9030 -e \"show frontends\"\n# 
+-------------------------------------------------------------------------------+------------------------------------------------------------+-------------+----------+-----------+---------+----------+------------+------+-------+-------------------+---------------------+----------+--------+---------------------+---------------+\n#
Image: docker.io/apecloud/kubeblocks-tools:0.9.0-beta.39
Image ID:
Port: 9901/TCP
Host Port: 0/TCP
Command: env
Args: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH) /bin/reloader --log-level info --operator-update-enable --tcp 9901 --config /opt/config-manager/config-manager.yaml
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 0
memory: 0
Requests:
cpu: 0
memory: 0
Environment Variables from:
strsent-nerqht-fe-env ConfigMap Optional: false
strsent-nerqht-fe-rsm-env ConfigMap Optional: false
Environment:
STARROCKS_USER: <set to the key 'username' in secret 'strsent-nerqht-fe-account-root'> Optional: false
STARROCKS_PASSWORD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
MYSQL_PWD: <set to the key 'password' in secret 'strsent-nerqht-fe-account-root'> Optional: false
KB_POD_NAME: strsent-nerqht-fe-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).strsent-nerqht-fe-headless.$(KB_NAMESPACE).svc
CONFIG_MANAGER_POD_IP: (v1:status.podIP)
TOOLS_PATH: /opt/kb-tools/reload/fe-cm:/opt/config-manager:/kb_tools
Mounts:
/kb_tools from kb-tools (rw)
/opt/config-manager from config-manager-config (rw)
/opt/kb-tools/reload/fe-cm from cm-script-fe-cm (rw)
/opt/starrocks/fe/conf from fe-cm (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jljd4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
log:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
SizeLimit:
SizeLimit:
Normal  Scheduled  21m  default-scheduler  Successfully assigned default/strsent-nerqht-fe-0 to gke-infracreate-gke-kbdata-e2-standar-25c8fd47-9yic
Normal  Pulled  21m  kubelet  Container image "docker.io/apecloud/kubeblocks-tools:0.9.0-beta.39" already present on machine
Normal  Created  21m  kubelet  Created container init-lorry
Normal  Started  21m  kubelet  Started container init-lorry
Normal  Pulled  20m  kubelet  Container image "docker.io/apecloud/starrocks-tools:3.2.2" already present on machine
Normal  Created  20m  kubelet  Created container starrocks-tools
Normal  Started  20m  kubelet  Started container starrocks-tools
Normal  Pulled  20m  kubelet  Container image "docker.io/starrocks/fe-ubuntu:3.2.2" already present on machine
Normal  Created  20m  kubelet  Created container fe
Normal  Started  20m  kubelet  Started container fe