percona / percona-server-mongodb-operator

Percona Operator for MongoDB
https://www.percona.com/doc/kubernetes-operator-for-psmongodb/
Apache License 2.0

K8SPSMDB-1170: set error state on failed reconcile of replset #1651

Status: Closed (pooknull closed 1 week ago)

pooknull commented 3 weeks ago

https://perconadev.atlassian.net/browse/K8SPSMDB-1170

CHANGE DESCRIPTION

Problem: When spec.tls.allowInvalidCertificates is set to false, the operator gets stuck initializing replsets. Errors appear in the operator log, but the cluster's .status.state is never set to error.

Solution: Set the cluster state to error when the reconcileCluster method fails.

CHECKLIST

Jira

Tests

Config/Logging/Testability

egegunes commented 2 weeks ago

This doesn't fix the real problem. Yes, we now set the cluster state to error, but the operator still can't recover if you set tls.allowInvalidCertificates=false. The operator is now stuck in the error state rather than the initializing state.

pooknull commented 1 week ago

@egegunes I agree that this PR doesn't fix K8SPSMDB-1156. But at least setting the cluster to the error state will allow the cluster to be deleted if finalizers were specified.

I will try to fix this issue in another PR.

JNKPercona commented 1 week ago

| Test name | Status |
| --- | --- |
| arbiter | passed |
| balancer | passed |
| custom-replset-name | passed |
| custom-tls | passed |
| custom-users-roles | passed |
| custom-users-roles-sharded | passed |
| cross-site-sharded | passed |
| data-at-rest-encryption | passed |
| data-sharded | passed |
| demand-backup | passed |
| demand-backup-eks-credentials | passed |
| demand-backup-physical | passed |
| demand-backup-physical-sharded | passed |
| demand-backup-sharded | passed |
| expose-sharded | passed |
| ignore-labels-annotations | passed |
| init-deploy | passed |
| finalizer | passed |
| ldap | passed |
| ldap-tls | passed |
| limits | passed |
| liveness | passed |
| mongod-major-upgrade | passed |
| mongod-major-upgrade-sharded | passed |
| monitoring-2-0 | failure |
| multi-cluster-service | passed |
| non-voting | passed |
| one-pod | passed |
| operator-self-healing-chaos | passed |
| pitr | passed |
| pitr-sharded | passed |
| pitr-physical | passed |
| pvc-resize | passed |
| recover-no-primary | passed |
| replset-overrides | passed |
| rs-shard-migration | passed |
| scaling | passed |
| scheduled-backup | passed |
| security-context | passed |
| self-healing-chaos | passed |
| service-per-pod | passed |
| serviceless-external-nodes | passed |
| smart-update | passed |
| split-horizon | passed |
| storage | passed |
| tls-issue-cert-manager | passed |
| upgrade | passed |
| upgrade-consistency | passed |
| upgrade-consistency-sharded-tls | passed |
| upgrade-sharded | passed |
| users | passed |
| version-service | passed |

We ran 52 out of 52 tests.

commit: https://github.com/percona/percona-server-mongodb-operator/pull/1651/commits/37dd5b4a14a84949aa9feac1d4f4e0e762b95ab9
image: perconalab/percona-server-mongodb-operator:PR-1651-37dd5b4a