Kubegres is a Kubernetes operator for deploying one or more clusters of PostgreSQL instances and managing database replication, failover and backup.
Hi, I was hoping to get some clarity on how to remediate the following. Something appears to be out of sync with my Kubegres operator: the previous blocking operation in the custom resource's status is not removed or fulfilled when the relevant StatefulSet becomes ready.
Initial state:
$ kubectl get sts | grep postgres
postgres-9 1/1 144d
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Active Blocking-Operation: None
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Previous Blocking-Operation {"OperationId": "Replica DB count spec enforcement", "StepId": "Replica DB is deploying", "HasTimedOut": false, "StatefulSetInstanceIndex": 9}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Database StorageClass states. {"IsDeployed": true, "name": "rubix-aws-provisioner-v4"}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Base Config states {"IsDeployed": true, "name": "base-kubegres-config"}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres All StatefulSets deployment states: {"Spec expected to deploy": 3, "Nbre Deployed": 1}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Primary states: {"IsDeployed": true, "Name": "postgres-9", "IsReady": true, "Pod Name": "postgres-9-0", "Pod IsDeployed": true, "Pod IsReady": true, "Pod IsStuck": false}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Primary Service states: {"IsDeployed": true, "name": "postgres"}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres Replica Service states: {"IsDeployed": true, "name": "postgres-replica"}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres BackUp states. {"IsCronJobDeployed": false, "IsPvcDeployed": false, "CronJobLastScheduleTime": ""}
2022-07-06T20:06:04.673Z INFO controllers.Kubegres We are going to deploy 2 Replica statefulSet(s)
2022-07-06T20:06:04.678Z INFO controllers.Kubegres Deploying Replica statefulSet 'postgres-9'
2022-07-06T20:06:04.696Z ERROR controllers.Kubegres Unable to deploy Replica StatefulSet. {"Replica name": "postgres-9", "error": "statefulsets.apps \"postgres-9\" already exists"}
I removed status as a subresource from the CRD so that I could update it manually (to remove the previous operation and increment the last created index), and that resulted in the next replica StatefulSet being created. However, the operator now seems to be stuck trying to recreate that StatefulSet:
postgres-10-0 1/1 Running 0 5m32s
postgres-9-0 1/1 Running 0 12h
status:
  blockingOperation:
    statefulSetOperation: {}
    statefulSetSpecUpdateOperation: {}
  enforcedReplicas: 9
  lastCreatedInstanceIndex: 9
  previousBlockingOperation:
    operationId: Replica DB count spec enforcement
    statefulSetOperation:
      instanceIndex: 10
      name: postgres-10
    statefulSetSpecUpdateOperation: {}
    stepId: Replica DB is deploying
    timeOutEpocInSeconds: 1657208439
2022-07-07T15:36:18.935Z INFO controllers.Kubegres Updating Kubegres' status: {"Field": "PreviousBlockingOperation", "New value": {"operationId":"Replica DB count spec enforcement","stepId":"Replica DB is deploying","timeOutEpocInSeconds":1657208478,"statefulSetOperation":{"instanceIndex":10,"name":"postgres-10"},"statefulSetSpecUpdateOperation":{}}}
2022-07-07T15:36:18.935Z DEBUG controller-runtime.manager.events Warning {"object": {"kind":"Kubegres","namespace":"data-oregano","name":"postgres","uid":"171e2976-6510-4cac-8ba6-79c17d03afe8","apiVersion":"kubegres.reactive-tech.io/v1","resourceVersion":"2730195589"}, "reason": "ReplicaStatefulSetDeploymentErr", "message": "Unable to deploy Replica StatefulSet. 'Replica name': postgres-10 - statefulsets.apps \"postgres-10\" already exists"}
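To make the mismatch in the status above explicit, here is a rough check I put together. It is only a sketch: `find_index_mismatch` and its parameters are hypothetical names of my own, not part of Kubegres, and the idea that the operator derives the next StatefulSet name from `lastCreatedInstanceIndex` is my inference from the logs, not something I have confirmed in the operator's code.

```python
import re


def find_index_mismatch(status, statefulset_names, prefix="postgres"):
    """Compare the status fields with the StatefulSets that actually exist.

    `status` is the `status` section of the Kubegres resource as a dict;
    `statefulset_names` is the list of StatefulSet names in the namespace.
    Both are supplied by the caller (hypothetical helper, not a Kubegres API).
    """
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)$")
    # Collect the numeric suffix of every StatefulSet named "<prefix>-<n>".
    indexes = [
        int(m.group(1))
        for name in statefulset_names
        if (m := pattern.match(name))
    ]
    highest_existing = max(indexes, default=None)
    last_created = status.get("lastCreatedInstanceIndex")
    pending_name = (
        status.get("previousBlockingOperation", {})
        .get("statefulSetOperation", {})
        .get("name")
    )
    return {
        "highest_existing_index": highest_existing,
        "last_created_index": last_created,
        # True when the status records a lower index than what is deployed.
        "status_is_behind": (
            highest_existing is not None
            and last_created is not None
            and last_created < highest_existing
        ),
        # True when the operation still recorded targets an existing StatefulSet.
        "pending_already_exists": pending_name in statefulset_names,
    }


# Values copied from the status and the existing StatefulSets shown above.
status = {
    "enforcedReplicas": 9,
    "lastCreatedInstanceIndex": 9,
    "previousBlockingOperation": {
        "operationId": "Replica DB count spec enforcement",
        "stepId": "Replica DB is deploying",
        "statefulSetOperation": {"instanceIndex": 10, "name": "postgres-10"},
    },
}
report = find_index_mismatch(status, ["postgres-9", "postgres-10"])
print(report)
```

With the values above, the status records index 9 as the last one created while `postgres-10` already exists, which would be consistent with the "already exists" error if my assumption about how the next name is chosen holds.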
Any help would be appreciated, thanks!