bitpoke / mysql-operator

Asynchronous MySQL Replication on Kubernetes using Percona Server and Openark's Orchestrator.
https://www.bitpoke.io/docs/mysql-operator/getting-started/
Apache License 2.0
1.02k stars, 274 forks

New cluster doesn't happen to end up in orchestrator #673

Open ynnt opened 3 years ago

ynnt commented 3 years ago

The cluster is stuck with Ready: False because the mysql pod never becomes Ready.
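For anyone triaging a similar report, the status conditions that the operator prints in the debug log below can be filtered mechanically. The following is only a minimal sketch and not part of the operator: `find_condition` is a hypothetical helper, and the sample payload is copied from the "cluster status" log entry in this report.

```python
import json

# Status payload copied from the "cluster status" debug entry in this report.
STATUS_JSON = """
{"conditions": [
  {"type": "ReadOnly", "status": "True",
   "lastTransitionTime": "2021-04-07T14:38:06Z",
   "reason": "ClusterReadOnlyTrue", "message": "read-only nodes: "},
  {"type": "Ready", "status": "False",
   "lastTransitionTime": "2021-04-07T14:38:06Z",
   "reason": "StatefulSetNotReady", "message": "StatefulSet is not ready"},
  {"type": "PendingFailoverAck", "status": "False",
   "lastTransitionTime": "2021-04-07T14:38:06Z",
   "reason": "NoPendingFailoverAckExists", "message": "no pending ack"}
]}
"""


def find_condition(status, cond_type):
    """Return the condition dict with the given type, or None if it is absent."""
    for cond in status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


status = json.loads(STATUS_JSON)
ready = find_condition(status, "Ready")
# Prints the status, reason and message of the Ready condition.
print(ready["status"], ready["reason"], ready["message"])
```

For this payload the Ready condition carries the reason StatefulSetNotReady, which points at the StatefulSet pods rather than at the operator itself.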


2021-04-07T14:52:54.553072021Z  DEBUG   controller-runtime.manager.events   Normal  {"object": {"kind":"Lease","namespace":"default","name":"mysql-operator-leader-election","uid":"c61dd4bc-4706-4b5e-9c46-b2267447f087","apiVersion":"coordination.k8s.io/v1","resourceVersion":"78295"}, "reason": "LeaderElection", "message": "cm-mysql-operator-0_750f6e52-877d-473c-9935-69d30acfd952 became leader"}
2021-04-07T14:52:54.553275205Z  INFO    controller-runtime.manager.controller.mysqlbackup-controller    Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.553360761Z  INFO    controller-runtime.manager.controller.mysqlbackup-controller    Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.553824764Z  INFO    controller-runtime.manager.controller.mysqlbackupcron-controller    Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.554015469Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.554121207Z  INFO    controller-runtime.manager.controller.mysql-database    Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.554335196Z  INFO    controller-runtime.manager.controller.mysql-user    Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.554562241Z  INFO    controller-runtime.manager.controller.controller.mysqlNode  Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.554786339Z  INFO    controller-runtime.manager.controller.controller.orchestrator   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.654175743Z  INFO    controller-runtime.manager.controller.mysqlbackup-controller    Starting Controller
2021-04-07T14:52:54.65433261Z   INFO    controller-runtime.manager.controller.mysql-database    Starting Controller
2021-04-07T14:52:54.654463104Z  INFO    controller-runtime.manager.controller.mysqlbackupcron-controller    Starting Controller
2021-04-07T14:52:54.654492149Z  INFO    controller-runtime.manager.controller.mysqlbackupcron-controller    Starting workers    {"worker count": 1}
2021-04-07T14:52:54.654561787Z  INFO    controller-runtime.manager.controller.mysql-user    Starting Controller
2021-04-07T14:52:54.654585036Z  INFO    controller-runtime.manager.controller.mysql-user    Starting workers    {"worker count": 1}
2021-04-07T14:52:54.654638173Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.654938043Z  INFO    controller-runtime.manager.controller.controller.orchestrator   Starting EventSource    {"source": "channel source: 0xc00022c2d0"}
2021-04-07T14:52:54.655058696Z  INFO    controller-runtime.manager.controller.controller.orchestrator   Starting Controller
2021-04-07T14:52:54.655927767Z  INFO    controller-runtime.manager.controller.controller.mysqlNode  Starting Controller
2021-04-07T14:52:54.656018901Z  DEBUG   controller.orchestrator register cluster in clusters list   {"obj": {"kind":"MysqlCluster","apiVersion":"mysql.presslabs.org/v1alpha1","metadata":{"name":"kl-my","namespace":"default","uid":"5a09c95d-a977-4a4b-94e3-a97209938043","resourceVersion":"74547","generation":1,"creationTimestamp":"2021-04-07T14:38:02Z","annotations":{"mysql.presslabs.org/version":"300"},"ownerReferences":[{"apiVersion":"kuberlogic.com/v1","kind":"KuberLogicService","name":"kl-my","uid":"9db08315-5aed-4f29-8c18-aa3e95ceb053","controller":true,"blockOwnerDeletion":true}],"finalizers":["mysql.presslabs.org/registered-in-orchestrator"],"managedFields":[{"manager":"operator","operation":"Update","apiVersion":"mysql.presslabs.org/v1alpha1","time":"2021-04-07T14:38:02Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:ownerReferences":{}},"f:spec":{".":{},"f:image":{},"f:podSpec":{".":{},"f:annotations":{".":{},"f:monitoring.cloudlinux.com/port":{},"f:monitoring.cloudlinux.com/scrape":{}},"f:containers":{},"f:imagePullSecrets":{},"f:initContainers":{},"f:metricsExporterResources":{".":{},"f:limits":{".":{},"f:cpu":{},"f:memory":{}},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}},"f:mysqlOperatorSidecarResources":{".":{},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}},"f:resources":{".":{},"f:limits":{".":{},"f:cpu":{},"f:memory":{}},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}}},"f:replicas":{},"f:secretName":{},"f:volumeSpec":{".":{},"f:persistentVolumeClaim":{".":{},"f:resources":{".":{},"f:requests":{".":{},"f:storage":{}}}}}}}},{"manager":"mysql-operator","operation":"Update","apiVersion":"mysql.presslabs.org/v1alpha1","time":"2021-04-07T14:38:06Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:mysql.presslabs.org/version":{}},"f:finalizers":{}},"f:spec":{"f:minAvailable":{},"f:podSpec":{"f:mysqlOperatorSidecarResources":{"f:limits":{".":{},"f:cpu":{},"f:memory":{}}}},"f:volumeSpec":{"f:persistentVolumeClaim":{"f:acces
sModes":{}}}},"f:status":{".":{},"f:conditions":{}}}}]},"spec":{"replicas":2,"secretName":"kl-my-cred","image":"quay.io/kuberlogic/mysql:5.7.26","podSpec":{"imagePullSecrets":[{"name":"kuberlogic-registry"}],"annotations":{"monitoring.cloudlinux.com/port":"9999","monitoring.cloudlinux.com/scrape":"true"},"resources":{"limits":{"cpu":"100m","memory":"512Mi"},"requests":{"cpu":"10m","memory":"256Mi"}},"initContainers":[{"name":"myisam-repair","image":"quay.io/kuberlogic/mysql:5.7.26","command":["/bin/sh","-c","for f in $(ls /var/lib/mysql/mysql/*MYI); do myisamchk -r --update-state $(echo $f | tr -d .MYI); done"],"resources":{},"volumeMounts":[{"name":"data","mountPath":"/var/lib/mysql"}]}],"containers":[{"name":"kuberlogic-exporter","image":"quay.io/kuberlogic/mysql-exporter-deprecated:v2","ports":[{"name":"metrics","containerPort":9999,"protocol":"TCP"}],"resources":{},"volumeMounts":[{"name":"data","mountPath":"/var/lib/mysql"}]}],"metricsExporterResources":{"limits":{"cpu":"100m","memory":"128Mi"},"requests":{"cpu":"10m","memory":"32Mi"}},"mysqlOperatorSidecarResources":{"requests":{"cpu":"10m","memory":"64Mi"}}},"volumeSpec":{"persistentVolumeClaim":{"resources":{"requests":{"storage":"1Gi"}}}}},"status":{"conditions":[{"type":"ReadOnly","status":"True","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"ClusterReadOnlyTrue","message":"read-only nodes: "},{"type":"Ready","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"StatefulSetNotReady","message":"StatefulSet is not ready"},{"type":"PendingFailoverAck","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"NoPendingFailoverAckExists","message":"no pending ack"}]}}}
2021-04-07T14:52:54.65837839Z   INFO    controller-runtime.manager.controller.mysql-database    Starting workers    {"worker count": 1}
2021-04-07T14:52:54.755666407Z  INFO    controller-runtime.manager.controller.mysqlbackup-controller    Starting workers        {"worker count": 1}
2021-04-07T14:52:54.755720705Z  INFO    controller-runtime.manager.controller.controller.orchestrator   Starting workers        {"worker count": 10}
2021-04-07T14:52:54.757668027Z  INFO    controller-runtime.manager.controller.controller.mysqlNode  Starting workers        {"worker count": 1}
2021-04-07T14:52:54.757705105Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.858307211Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:54.959827321Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:55.060318519Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting EventSource    {"source": "kind source: /, Kind="}
2021-04-07T14:52:55.161471864Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting Controller
2021-04-07T14:52:55.161604803Z  INFO    controller-runtime.manager.controller.controller.mysqlcluster   Starting workers        {"worker count": 1}
2021-04-07T14:52:55.161879663Z  DEBUG   controller.mysqlcluster reconcile cluster   {"key": "default/kl-my"}
2021-04-07T14:52:55.163217074Z  DEBUG   unchanged   {"syncer": "ConfigMap", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "/v1, Kind=ConfigMap", "diff": []}
2021-04-07T14:52:55.163743132Z  DEBUG   unchanged   {"syncer": "OperatedSecret", "key": {"namespace": "default", "name": "kl-my-mysql-operated"}, "kind": "/v1, Kind=Secret", "diff": []}
2021-04-07T14:52:55.164085532Z  DEBUG   unchanged   {"syncer": "Secret", "key": {"namespace": "default", "name": "kl-my-cred"}, "kind": "/v1, Kind=Secret", "diff": []}
2021-04-07T14:52:55.16461333Z   DEBUG   unchanged   {"syncer": "HeadlessSVC", "key": {"namespace": "default", "name": "mysql"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.166243362Z  DEBUG   unchanged   {"syncer": "MasterSVC", "key": {"namespace": "default", "name": "kl-my-mysql-master"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.1668702Z    DEBUG   unchanged   {"syncer": "HealthySVC", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.167596425Z  DEBUG   unchanged   {"syncer": "HealthyReplicasSVC", "key": {"namespace": "default", "name": "kl-my-mysql-replicas"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.208066905Z  DEBUG   updated {"syncer": "StatefulSet", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "apps/v1, Kind=StatefulSet", "diff": []}
2021-04-07T14:52:55.20854835Z   DEBUG   unchanged   {"syncer": "PDB", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "policy/v1beta1, Kind=PodDisruptionBudget", "diff": []}
2021-04-07T14:52:55.2085749Z    DEBUG   controller.mysqlcluster cluster status  {"key": "default/kl-my", "status": {"conditions":[{"type":"ReadOnly","status":"True","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"ClusterReadOnlyTrue","message":"read-only nodes: "},{"type":"Ready","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"StatefulSetNotReady","message":"StatefulSet is not ready"},{"type":"PendingFailoverAck","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"NoPendingFailoverAckExists","message":"no pending ack"}]}}
2021-04-07T14:52:55.208888335Z  DEBUG   controller-runtime.manager.events   Normal  {"object": {"kind":"MysqlCluster","namespace":"default","name":"kl-my","uid":"5a09c95d-a977-4a4b-94e3-a97209938043","apiVersion":"mysql.presslabs.org/v1alpha1","resourceVersion":"74547"}, "reason": "StatefulSetSyncSuccessfull", "message": "apps/v1, Kind=StatefulSet default/kl-my-mysql updated successfully"}
2021-04-07T14:52:55.310250803Z  DEBUG   controller.mysqlcluster reconcile cluster   {"key": "default/kl-my"}
2021-04-07T14:52:55.311133499Z  DEBUG   unchanged   {"syncer": "ConfigMap", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "/v1, Kind=ConfigMap", "diff": []}
2021-04-07T14:52:55.311607029Z  DEBUG   unchanged   {"syncer": "OperatedSecret", "key": {"namespace": "default", "name": "kl-my-mysql-operated"}, "kind": "/v1, Kind=Secret", "diff": []}
2021-04-07T14:52:55.311900251Z  DEBUG   unchanged   {"syncer": "Secret", "key": {"namespace": "default", "name": "kl-my-cred"}, "kind": "/v1, Kind=Secret", "diff": []}
2021-04-07T14:52:55.312375685Z  DEBUG   unchanged   {"syncer": "HeadlessSVC", "key": {"namespace": "default", "name": "mysql"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.313117698Z  DEBUG   unchanged   {"syncer": "MasterSVC", "key": {"namespace": "default", "name": "kl-my-mysql-master"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.313745106Z  DEBUG   unchanged   {"syncer": "HealthySVC", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.314438017Z  DEBUG   unchanged   {"syncer": "HealthyReplicasSVC", "key": {"namespace": "default", "name": "kl-my-mysql-replicas"}, "kind": "/v1, Kind=Service", "diff": []}
2021-04-07T14:52:55.34431649Z   DEBUG   updated {"syncer": "StatefulSet", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "apps/v1, Kind=StatefulSet", "diff": []}
2021-04-07T14:52:55.344757091Z  DEBUG   unchanged   {"syncer": "PDB", "key": {"namespace": "default", "name": "kl-my-mysql"}, "kind": "policy/v1beta1, Kind=PodDisruptionBudget", "diff": []}
2021-04-07T14:52:55.344790801Z  DEBUG   controller.mysqlcluster cluster status  {"key": "default/kl-my", "status": {"conditions":[{"type":"ReadOnly","status":"True","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"ClusterReadOnlyTrue","message":"read-only nodes: "},{"type":"Ready","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"StatefulSetNotReady","message":"StatefulSet is not ready"},{"type":"PendingFailoverAck","status":"False","lastTransitionTime":"2021-04-07T14:38:06Z","reason":"NoPendingFailoverAckExists","message":"no pending ack"}]}}
2021-04-07T14:52:55.345053401Z  DEBUG   controller-runtime.manager.events   Normal  {"object": {"kind":"MysqlCluster","namespace":"default","name":"kl-my","uid":"5a09c95d-a977-4a4b-94e3-a97209938043","apiVersion":"mysql.presslabs.org/v1alpha1","resourceVersion":"74547"}, "reason": "StatefulSetSyncSuccessfull", "message": "apps/v1, Kind=StatefulSet default/kl-my-mysql updated successfully"}
2021-04-07T14:52:59.553012175Z  DEBUG   controller.orchestrator Schedule new cluster for reconciliation {"key": "default/kl-my"}
2021-04-07T14:52:59.553225885Z  DEBUG   controller.orchestrator reconciling cluster {"key": "default/kl-my"}
2021-04-07T14:52:59.554547195Z  DEBUG   unchanged   {"syncer": "OrchestratorFinalizerSyncer", "key": {"namespace": "default", "name": "kl-my"}, "kind": "mysql.presslabs.org/v1alpha1, Kind=MysqlCluster", "diff": []}
2021-04-07T14:52:59.56895656Z   WARNING orchestrator-reconciler cluster not found in Orchestrator   {"key": "default/kl-my", "error": "not found"}
github.com/go-logr/zapr.(*zapLogger).Info
    /go/pkg/mod/github.com/go-logr/zapr@v0.4.0/zapr.go:126
github.com/presslabs/mysql-operator/pkg/controller/orchestrator.(*orcUpdater).getFromOrchestrator
    /go/src/github.com/presslabs/mysql-operator/pkg/controller/orchestrator/orchestrator_reconcile.go:133
github.com/presslabs/mysql-operator/pkg/controller/orchestrator.(*orcUpdater).Sync
    /go/src/github.com/presslabs/mysql-operator/pkg/controller/orchestrator/orchestrator_reconcile.go:83
github.com/presslabs/controller-util/syncer.Sync
    /go/pkg/mod/github.com/presslabs/controller-util@v0.3.0-alpha.2/syncer/syncer.go:82
github.com/presslabs/mysql-operator/pkg/controller/orchestrator.(*ReconcileMysqlCluster).Reconcile
    /go/src/github.com/presslabs/mysql-operator/pkg/controller/orchestrator/orchestrator_controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:216
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:99

jicki commented 3 years ago

Same problem

nigh8w0lf commented 3 years ago

Yes, and the orchestrator logs show: Unable to determine cluster name

This is for a brand-new cluster.

nigh8w0lf commented 3 years ago

I deployed the cluster in a different namespace and also in the same namespace as the operator, with the same result. I also tried changing the name of the cluster, but it has the same issue as above.

sagikazarmark commented 3 years ago

I have a similar issue: the cluster starts, but after some time MySQL becomes non-ready and I get the above log message in the operator logs.

browol commented 3 years ago

I have this problem too

iefc commented 2 years ago

Same problem

calind commented 2 years ago

Please make sure you are not hitting #170 (see https://www.bitpoke.io/docs/mysql-operator/deploy-mysql-cluster/#note-1).

Also please try with v0.5.0.

tebaly commented 1 year ago

Hello. In my case:

Everything worked, except that there were errors in the MySQL clusters only.

Naturally, I assumed the problem was mysql-operator, but no changes helped at all. Everything else worked, while the MySQL clusters gradually stopped working. Horror...

Then I ran the Kubespray upgrade-cluster.yml playbook.

An error occurred: the pod of the MySQL cluster was not deleted. I had seen the same error at the very beginning, when I first tried to fix the K8S cluster, and I ignored it then. It happened at the "Drain node" stage:

fatal: [node1]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["/usr/local/bin/kubectl", "--kubeconfig", "/etc/kubernetes/admin.conf", "drain", "--force", "--ignore-daemonsets", "--grace-period", "300", "--timeout", "360s", "--delete-emptydir-data", "node1"], "delta": "0:06:01.760844", "end": "2022-10-05 02:44:14.018346", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2022-10-05 02:38:12.257502", "stderr": "WARNING: ignoring DaemonSet-managed Pods: default/netchecker-agent-hostnet-xvkjz, default/netchecker-agent-w282k, *** \nerror when evicting pods/\"***-mysql-0\" -n \"***\" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.\nerror when evicting pods/\"***-mysql-0\" -n \"***\"

Kubespray was unable to upgrade the cluster completely. In my case, that was the reason.

Solution (in my case)

  1. Run the Kubespray upgrade-cluster.yml playbook.
  2. Follow the process until it reaches the "Drain node" stage for each node.
  3. The process will hang at this stage and wait for a long time.
  4. Delete all MySQL pods from this node.
  5. The process will move forward.
  6. The K8S cluster will be updated without errors and everything will work.
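
The drain hangs because `kubectl drain` goes through the eviction API, which refuses an eviction that would violate a PodDisruptionBudget, as the error above reports. A rough model of that check is sketched below, assuming a budget expressed as an integer `minAvailable`. The function names and the numbers are illustrative only and are not taken from the operator or from the Kubernetes source.

```python
def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Pods that may be evicted right now under a minAvailable budget (never negative)."""
    return max(0, current_healthy - min_available)


def can_evict(current_healthy: int, min_available: int) -> bool:
    """An eviction is admitted only while at least one disruption is allowed."""
    return disruptions_allowed(current_healthy, min_available) > 0


# Hypothetical scenarios for a budget with minAvailable = 1.
scenarios = [
    ("two healthy pods", 2, 1),
    ("one healthy pod, one not ready", 1, 1),
    ("no healthy pods", 0, 1),
]
for label, healthy, min_available in scenarios:
    verdict = "evict" if can_evict(healthy, min_available) else "blocked"
    print(f"{label}: {verdict}")
```

Under this model, a cluster whose MySQL pods are not all Ready can leave the budget with zero allowed disruptions, so the drain keeps retrying until it times out. Deleting the pod directly, as in step 4, does not go through the eviction API and is therefore not held back by the budget.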

oau-dev commented 7 months ago

Hello,

Same problem here. 77 clusters deployed without a problem, but one of them will not deploy its second node because of "cluster not found in Orchestrator". There is no other error at all.

oau-dev commented 7 months ago

Hello, I found that some data is still in the SQLite DB days after the cluster was deleted, in the tables database_instance_last_analysis, database_instance_tls, kv_store, and hostname_ips.
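
A quick way to see how much is left behind is to count the rows in those tables. This is a minimal sketch using Python's standard `sqlite3` module. The table names are the ones listed above; `leftover_row_counts` is a hypothetical helper, the location of Orchestrator's SQLite file depends on the deployment and is not assumed here, and the demo schema at the bottom is invented purely so the snippet runs on its own.

```python
import sqlite3

# Table names as listed in the comment above.
TABLES = [
    "database_instance_last_analysis",
    "database_instance_tls",
    "kv_store",
    "hostname_ips",
]


def leftover_row_counts(conn, tables=TABLES):
    """Return {table: row count}, with None for a table that does not exist."""
    counts = {}
    for table in tables:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if exists is None:
            counts[table] = None
            continue
        # Names come from the fixed list above, so quoting them inline is safe.
        counts[table] = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return counts


if __name__ == "__main__":
    # Demo on an in-memory database; the column layout here is made up.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE kv_store (store_key TEXT, store_value TEXT)")
    demo.execute("INSERT INTO kv_store VALUES ('example-key', 'example-value')")
    print(leftover_row_counts(demo))
```

To inspect a real instance, the in-memory connection would be replaced with a connection to a copy of the Orchestrator database file. Working on a copy keeps the running Orchestrator untouched.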

Thanks for your help, I'm really stuck here :(