oceanbase / ob-operator

Kubernetes operator for OceanBase
https://oceanbase.github.io/ob-operator/

[Bug]: Deploy obcluster failed with latest version #499

Open silentacorn opened 1 month ago

silentacorn commented 1 month ago

Describe the bug

I tried to deploy a simple OceanBase cluster with ob-operator, but it always fails with the following error logs, even though the pods already show as Running:

{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observers","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone1","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1","reconcileID":"fe450abf-de5d-440b-b5ed-e7b6fb85d396"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observer","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone1","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1","reconcileID":"fe450abf-de5d-440b-b5ed-e7b6fb85d396","server":"obcluster-1-zone1-272987e9035e"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Newly created server, init status","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone1-272987e9035e","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1-272987e9035e","reconcileID":"cab46ab5-f960-4474-be95-2a0dadbdf53a"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observers","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone2","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2","reconcileID":"17e81701-0ccd-43ba-b3b2-bf99b6ed0623"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observer","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone2","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2","reconcileID":"17e81701-0ccd-43ba-b3b2-bf99b6ed0623","server":"obcluster-1-zone2-c557ffed90ab"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Newly created server, init status","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone2-c557ffed90ab","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2-c557ffed90ab","reconcileID":"ec6f9e99-3765-4ac8-897e-f0a00a0a6a9f"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observers","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone3","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3","reconcileID":"09711cad-b115-48a5-b497-846cb93840aa"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Create observer","controller":"obzone","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBZone","OBZone":{"name":"obcluster-1-zone3","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3","reconcileID":"09711cad-b115-48a5-b497-846cb93840aa","server":"obcluster-1-zone3-b92a1921f4dd"}
{"level":"INFO","ts":"2024-08-01T19:41:02+08:00","msg":"Newly created server, init status","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone3-b92a1921f4dd","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3-b92a1921f4dd","reconcileID":"e628e425-3d9d-4afd-a21f-83ca582afe83"}
{"level":"INFO","ts":"2024-08-01T19:41:03+08:00","msg":"Create observer when create obcluster","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone1-272987e9035e","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1-272987e9035e","reconcileID":"0eac2fb7-ec9f-4400-a6c3-3433eec52d7d"}
{"level":"INFO","ts":"2024-08-01T19:41:03+08:00","msg":"Create observer when create obcluster","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone2-c557ffed90ab","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2-c557ffed90ab","reconcileID":"a3366ec0-475f-422f-b976-60f4e6d67eb9"}
{"level":"INFO","ts":"2024-08-01T19:41:03+08:00","msg":"Create observer when create obcluster","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone3-b92a1921f4dd","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3-b92a1921f4dd","reconcileID":"34dbd673-0d8c-4272-9a51-77c6e1930ae3"}
{"level":"INFO","ts":"2024-08-01T19:41:10+08:00","msg":"static ip not supported, set empty annotation","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone1-272987e9035e","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1-272987e9035e","reconcileID":"095a8d75-4893-493d-ada8-f6d30f5976b4"}
{"level":"INFO","ts":"2024-08-01T19:41:10+08:00","msg":"static ip not supported, set empty annotation","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone2-c557ffed90ab","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2-c557ffed90ab","reconcileID":"03fccac2-3b74-457b-b2ca-46109752f957"}
{"level":"INFO","ts":"2024-08-01T19:41:10+08:00","msg":"static ip not supported, set empty annotation","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone3-b92a1921f4dd","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3-b92a1921f4dd","reconcileID":"ad92b607-f1ed-412e-9e69-dda2adf0e4c0"}
{"level":"INFO","ts":"2024-08-01T19:41:23+08:00","msg":"Pod is ready","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone1-272987e9035e","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone1-272987e9035e","reconcileID":"0033dca3-0681-44ac-b720-9a3b0bcc2843"}
{"level":"INFO","ts":"2024-08-01T19:41:25+08:00","msg":"Pod is ready","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone2-c557ffed90ab","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone2-c557ffed90ab","reconcileID":"9dd82dbe-0c90-4a08-ba8d-808ea5bb7133"}
{"level":"INFO","ts":"2024-08-01T19:41:33+08:00","msg":"Pod is ready","controller":"observer","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBServer","OBServer":{"name":"obcluster-1-zone3-b92a1921f4dd","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster-1-zone3-b92a1921f4dd","reconcileID":"c5b30b83-a796-4e88-9ed9-6ede5e9e1252"}
{"level":"ERROR","ts":"2024-08-01T19:42:12+08:00","msg":"get oceanbase operation manager failed","controller":"obcluster","controllerGroup":"oceanbase.oceanbase.com","controllerKind":"OBCluster","OBCluster":{"name":"obcluster","namespace":"oceanbase"},"namespace":"oceanbase","name":"obcluster","reconcileID":"44f8cc7e-6701-4fa7-8f20-788b32676d0f","error":"Can not get oceanbase operation manager of obcluster obcluster after checked all server","errorVerbose":"Can not get oceanbase operation manager of obcluster obcluster after checked all server\ngithub.com/oceanbase/ob-operator/internal/resource/utils.getSysClient\n\t/workspace/internal/resource/utils/util.go:138\ngithub.com/oceanbase/ob-operator/internal/resource/utils.GetSysOperationClient\n\t/workspace/internal/resource/utils/util.go:53\ngithub.com/oceanbase/ob-operator/internal/resource/obcluster.(*OBClusterManager).getOceanbaseOperationManager\n\t/workspace/internal/resource/obcluster/obcluster_task.go:266\ngithub.com/oceanbase/ob-operator/internal/resource/obcluster.(*OBClusterManager).Bootstrap\n\t/workspace/internal/resource/obcluster/obcluster_task.go:281\ngithub.com/oceanbase/ob-operator/pkg/task.runTask\n\t/workspace/pkg/task/task_manager.go:63\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598","stacktrace":"github.com/oceanbase/ob-operator/internal/resource/obcluster.(*OBClusterManager).Bootstrap\n\t/workspace/internal/resource/obcluster/obcluster_task.go:291\ngithub.com/oceanbase/ob-operator/pkg/task.runTask\n\t/workspace/pkg/task/task_manager.go:63"}

The obcluster.yaml file:

apiVersion: oceanbase.oceanbase.com/v1alpha1
kind: OBCluster
metadata:
  name: obcluster
  namespace: oceanbase
spec:
  clusterName: obcluster
  clusterId: 1
  serviceAccount: "default"
  userSecrets:
    root: root-password
    proxyro: proxyro-password
    monitor: monitor-password
    operator: operator-password
  topology:

Environment

ob-operator: 2.2.2
observer:
  image: oceanbase/oceanbase-cloud-native:4.2.1.6-106000012024042515
monitor:
  image: oceanbase/obagent:4.2.1-100000092023101717
k8s: v1.28.2

Fast reproduce steps

  1. Deploy ob-operator 2.2.2 with Helm in the oceanbase-system namespace.
  2. Create the secrets.
  3. Deploy OceanBase with kubectl apply -f obcluster.yaml in the oceanbase namespace.

Expected behavior

No response

Actual behavior

kubectl -n oceanbase get obcluster
NAME        STATUS   AGE
obcluster   failed   14m

kubectl -n oceanbase describe obcluster obcluster
...
Events:
  Type     Reason       Age  From                  Message
  Normal                13m  obcluster-controller  newly created cluster, init status
  Normal                13m  obcluster-controller  Create obzone obcluster-1-zone1 successfully
  Normal                13m  obcluster-controller  Create obzone obcluster-1-zone2 successfully
  Normal                13m  obcluster-controller  Create obzone obcluster-1-zone3 successfully
  Warning  Task failed  12m  obcluster-controller  get oceanbase operation manager: Can not get oceanbase operation manager of obcluster obcluster after checked all server

Additional context

No response

chris-sun-star commented 1 month ago

A pod in Running status only indicates that the observer process is up and listening on the desired port; there are still extra steps needed to successfully bootstrap an OceanBase cluster. The events show that ob-operator failed to obtain a connection after trying all the observers, about 1 minute after the obzones were created. The observers may not have been ready to handle connections yet, because they still have some initialization to do after the process starts. You can try connecting to an observer with a mysql client as the root user without a password to see whether the observer is connectable, and get the first log file of the observer; the file may be named observer.log.20.... We've encountered similar issues caused by slow IO, so you may also want to check the disk IOPS.
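
A minimal sketch of the connectivity check suggested above, in case it helps: a small retry helper that re-runs a command until it succeeds, which can wrap the mysql client call while the observer finishes initializing. The helper name `retry`, the pod name placeholder, the attempt count and delay, and port 2881 (assumed here to be the observer's SQL port) are illustrative assumptions and do not come from this thread.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...
# Runs CMD until it exits 0 or ATTEMPTS runs have failed.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    i=$((i + 1))
    # Sleep only when another attempt is still coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example usage (hypothetical values; replace <observer-pod> with a real pod name):
#   POD_IP=$(kubectl -n oceanbase get pod <observer-pod> -o jsonpath='{.status.podIP}')
#   retry 30 10 mysql -h "$POD_IP" -P 2881 -uroot -e 'SELECT 1'
```

If the query starts succeeding only after several attempts, that would be consistent with the observer needing more time to initialize than the operator waited for; if it never succeeds, the observer log is the next place to look.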