percona / percona-postgresql-operator

Percona Operator for PostgreSQL
https://www.percona.com/doc/kubernetes-operator-for-postgresql/index.html
Apache License 2.0

K8SPG-518: Add latest restorable time field to backups #759

Closed by egegunes 1 month ago

egegunes commented 2 months ago


CHANGE DESCRIPTION

These changes add a new field called latestRestorableTime to PerconaPGBackup objects.

Since the PG Operator doesn't reconcile objects unless they change, we needed to introduce a new mechanism to trigger reconciliation: we create a channel, store it in both PGClusterReconciler and PGBackupReconciler, and send events to it to trigger reconciliation of PerconaPGBackup objects.
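To illustrate the idea, here is a minimal sketch of a shared channel driving reconciliation. It uses only the standard library and does not reproduce the operator's code: `ReconcileEvent`, `BackupReconciler`, and `runDemo` are hypothetical names, and the real reconcilers are presumably wired through controller-runtime rather than a hand-written loop.

```go
package main

import "fmt"

// ReconcileEvent is a hypothetical stand-in for the event the cluster
// reconciler sends when a backup object should be reconciled again.
type ReconcileEvent struct {
	Namespace string
	Name      string
}

// BackupReconciler is a hypothetical, simplified consumer. It holds the
// receiving end of the shared channel and records what it reconciled.
type BackupReconciler struct {
	Events     <-chan ReconcileEvent
	Reconciled []string
}

// Run reconciles one backup per received event until the channel is closed.
func (r *BackupReconciler) Run() {
	for ev := range r.Events {
		r.Reconciled = append(r.Reconciled, ev.Namespace+"/"+ev.Name)
	}
}

// runDemo plays the role of the cluster reconciler: it holds the sending
// end of the same channel and pushes two events through it.
func runDemo() []string {
	events := make(chan ReconcileEvent)
	r := &BackupReconciler{Events: events}

	done := make(chan struct{})
	go func() {
		r.Run()
		close(done)
	}()

	events <- ReconcileEvent{Namespace: "default", Name: "backup1"}
	events <- ReconcileEvent{Namespace: "default", Name: "backup2"}
	close(events)
	<-done // wait for the consumer before reading its results

	return r.Reconciled
}

func main() {
	fmt.Println(runDemo())
}
```

The point of the design is that the backup object itself never changes, so the only way to get it reconciled again is for another component to push an event for it.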

When a cluster is reconciled for the first time, the operator starts a separate goroutine that continuously fetches the commit timestamp from the latest uploaded WAL file. If the timestamp differs from the status.latestRestorableTime field of the last backup, the goroutine sends an event to the channel. Since PGBackupReconciler watches this channel, it then reconciles the backup.
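The polling loop described above could look roughly like the following sketch. It is an assumption-laden illustration, not the PR's implementation: `fetchFunc`, `changed`, `watch`, and `runDemo` are hypothetical names, the polling interval is arbitrary, and a local `recorded` variable stands in for status.latestRestorableTime of the last backup.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fetchFunc is a hypothetical hook that returns the commit timestamp
// found in the latest uploaded WAL file.
type fetchFunc func(ctx context.Context) (time.Time, error)

// changed reports whether the fetched timestamp differs from the recorded
// one. A nil recorded value means nothing has been recorded yet.
func changed(latest time.Time, recorded *time.Time) bool {
	return recorded == nil || !recorded.Equal(latest)
}

// watch polls fetch on every tick and sends the timestamp on trigger only
// when it differs from the recorded value. It returns when ctx is cancelled.
func watch(ctx context.Context, interval time.Duration, fetch fetchFunc, trigger chan<- time.Time) {
	var recorded *time.Time // stand-in for status.latestRestorableTime
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			ts, err := fetch(ctx)
			if err != nil || !changed(ts, recorded) {
				continue // assumed policy: skip this tick, retry on the next
			}
			select {
			case trigger <- ts:
				recorded = &ts
			case <-ctx.Done():
				return
			}
		}
	}
}

// runDemo feeds the watcher the same timestamp twice and then a newer one,
// and returns the offsets (in seconds) of the timestamps that were sent.
func runDemo() []int {
	base := time.Date(2024, 5, 27, 13, 43, 47, 0, time.UTC)
	calls := 0
	fetch := func(context.Context) (time.Time, error) {
		calls++
		if calls <= 2 {
			return base, nil
		}
		return base.Add(time.Second), nil
	}

	ctx, cancel := context.WithCancel(context.Background())
	trigger := make(chan time.Time)
	done := make(chan struct{})
	go func() {
		watch(ctx, time.Millisecond, fetch, trigger)
		close(done)
	}()

	var offsets []int
	for len(offsets) < 2 {
		ts := <-trigger
		offsets = append(offsets, int(ts.Sub(base).Seconds()))
	}
	cancel()
	<-done
	return offsets
}

func main() {
	fmt.Println(runDemo())
}
```

In the demo the repeated timestamp produces no second event, so only two events reach the channel even though the watcher polls more often than that.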

When a cluster is deleted, the operator stops the goroutine by sending a DeleteEvent to an additional channel called StopExternalWatchers. For this we introduce a new finalizer called percona.com/stop-watchers, which ensures that the watcher goroutine is stopped and the watcher is removed from the watcher registry. Users don't need to add this finalizer to their CRs: the operator automatically patches the CR if the finalizer is missing from metadata.finalizers.
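The two pieces of this cleanup path, adding the finalizer when it is missing and stopping a registered watcher, can be sketched as follows. Only the finalizer name comes from this PR; `ensureFinalizer`, `Registry`, and its methods are hypothetical, and the sketch signals a stop by closing a per-watcher channel instead of sending a DeleteEvent on StopExternalWatchers.

```go
package main

import (
	"fmt"
	"sync"
)

// Finalizer name taken from the PR description.
const stopWatchersFinalizer = "percona.com/stop-watchers"

// ensureFinalizer returns the finalizer list with name present and reports
// whether it had to be added (i.e. whether the CR would need a patch).
func ensureFinalizer(finalizers []string, name string) ([]string, bool) {
	for _, f := range finalizers {
		if f == name {
			return finalizers, false
		}
	}
	return append(finalizers, name), true
}

// Registry is a hypothetical watcher registry keyed by cluster name.
type Registry struct {
	mu    sync.Mutex
	stops map[string]chan struct{}
}

func NewRegistry() *Registry {
	return &Registry{stops: make(map[string]chan struct{})}
}

// Add registers a watcher and returns the channel its goroutine should
// select on. Registering the same key twice returns the existing channel.
func (r *Registry) Add(key string) <-chan struct{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	if ch, ok := r.stops[key]; ok {
		return ch
	}
	ch := make(chan struct{})
	r.stops[key] = ch
	return ch
}

// Remove signals the watcher to stop and drops it from the registry.
// It reports false if no watcher was registered under key.
func (r *Registry) Remove(key string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch, ok := r.stops[key]
	if !ok {
		return false
	}
	close(ch)
	delete(r.stops, key)
	return true
}

func main() {
	finalizers, added := ensureFinalizer(nil, stopWatchersFinalizer)
	fmt.Println(finalizers, added)

	reg := NewRegistry()
	stop := reg.Add("default/cluster1")
	fmt.Println(reg.Remove("default/cluster1"))
	<-stop // returns immediately because Remove closed the channel
}
```

Making `Remove` safe to call twice matters here because a finalizer handler may run more than once for the same object before the finalizer is actually removed.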

CHECKLIST

Jira

Tests

Config/Logging/Testability

hors commented 1 month ago

I have created two clusters under one operator and see the following:

2024-05-27T13:44:23.438Z  INFO    Triggering PGBackup reconcile   {"controller": "perconapgcluster", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGCluster", "PerconaPGCluster": {"name":"cluster1","namespace":"kuttl-test-free-seal"}, "namespace": "kuttl-test-free-seal", "name": "cluster1", "reconcileID": "4db25166-49db-4e36-bb48-ff25de749dfa", "latestBackup": "backupr42", "latestRestorableTimeError": "PANIC=runtime error: invalid memory address or nil pointer dereference", "latestCommitTimestamp": "2024-05-27 13:43:47 +0000 UTC"}
2024-05-27T13:44:23.552Z  INFO    Got latest restorable timestamp {"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"backupr42","namespace":"kuttl-test-free-seal"}, "namespace": "kuttl-test-free-seal", "name": "backupr42", "reconcileID": "296e46aa-3a03-4684-9b06-b98abc19b8b6", "request": {"name":"backupr42","namespace":"kuttl-test-free-seal"}, "timestamp": "2024-05-27 13:43:47 +0000 UTC"}
2024-05-27T13:44:23.836Z  INFO    Got latest restorable timestamp {"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"backupr42","namespace":"kuttl-test-free-seal"}, "namespace": "kuttl-test-free-seal", "name": "backupr42", "reconcileID": "3c3c9f7b-e7ad-4752-9300-300f4849df77", "request": {"name":"backupr42","namespace":"kuttl-test-free-seal"}, "timestamp": "2024-05-27 13:43:47 +0000 UTC"}

The panic in latestRestorableTimeError shows up when the operator logs "Triggering PGBackup reconcile" for two DBs at the same time.

JNKPercona commented 1 month ago
Test name Status
custom-extensions passed
demand-backup passed
init-deploy passed
monitoring passed
operator-self-healing passed
scaling passed
scheduled-backup passed
self-healing passed
start-from-backup passed
tablespaces passed
telemetry-transfer passed
upgrade-minor passed
users passed
pitr passed
We run 14 out of 14

commit: https://github.com/percona/percona-postgresql-operator/pull/759/commits/aaa0931915fa2b71a41cb54648d2bf80789f9164
image: perconalab/percona-postgresql-operator:PR-759-aaa093191