canonical / postgresql-k8s-operator

A Charmed Operator for running PostgreSQL on Kubernetes
https://charmhub.io/postgresql-k8s
Apache License 2.0

[Backup/Restore] Unable to restore backup in another cluster #335

Closed javierdelapuente closed 8 months ago

javierdelapuente commented 10 months ago

Steps to reproduce

  1. Back up the database to an S3 bucket.

  2. Create a new, separate cluster, using the same controller and model names as the original one.

  3. juju deploy postgresql-k8s --trust

  4. juju deploy s3-integrator

  5. juju run s3-integrator/leader sync-s3-credentials access-key= secret-key=

  6. juju config s3-integrator .... (same as in the original cluster)

  7. juju relate s3-integrator postgresql-k8s
     At this point, with S3 correctly configured, the workload enters a blocked state with the message "failed to initialize stanza, check your S3 settings".

  8. juju run postgresql-k8s/leader list-backups
     The backups are listed correctly.

  9. juju run postgresql-k8s/leader restore backup-id=2023-11-27T09:37:51Z
     The restore fails because the charm does not expect the current blocked state.

Expected behavior

The backup should be restored in the new cluster.

Actual behavior

Versions

Operating system: Ubuntu 22.04.3 LTS

Juju CLI: 3.1.6-genericlinux-amd64

Juju agent:

Model             Controller  Cloud/Region        Version  SLA          Timestamp
tutorial-synapse  microk8s    microk8s/localhost  3.1.6    unsupported  15:26:31+01:00

App             Version  Status   Scale  Charm           Channel    Rev  Address         Exposed  Message
postgresql-k8s  14.9     waiting      1  postgresql-k8s  14/stable  158  10.152.183.134  no       installing agent
s3-integrator            active       1  s3-integrator   stable      13  10.152.183.70   no       

Unit               Workload  Agent  Address     Ports  Message
postgresql-k8s/0*  blocked   idle   10.1.42.34         failed to initialize stanza, check your S3 settings
s3-integrator/0*   active    idle   10.1.42.35         

Charm revision:

microk8s: MicroK8s v1.27.7 revision 6101

Log output

Juju debug log:

When creating the integration:

unit-postgresql-k8s-0: 15:22:15 INFO unit.postgresql-k8s/0.juju-log s3-parameters:9: Bucket backups-bucket exists.
unit-postgresql-k8s-0: 15:22:16 ERROR unit.postgresql-k8s/0.juju-log s3-parameters:9: non-zero exit code 28 executing ['pgbackrest', '--stanza=tutorial-synapse.patroni-postgresql-k8s', 'stanza-create'], stdout='', stderr='ERROR: [028]: backup and archive info files exist but do not match the database\n       HINT: is this the correct stanza?\n       HINT: did an error occur during stanza-upgrade?\n'
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/src/backups.py", line 299, in _initialise_stanza
    self._execute_command(["pgbackrest", f"--stanza={self.stanza_name}", "stanza-create"])
  File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/src/backups.py", line 212, in _execute_command
    ).wait_output()
  File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/venv/ops/pebble.py", line 1359, in wait_output
    raise ExecError[AnyStr](self._command, exit_code, out_value, err_value)
ops.pebble.ExecError: non-zero exit code 28 executing ['pgbackrest', '--stanza=tutorial-synapse.patroni-postgresql-k8s', 'stanza-create'], stdout='', stderr='ERROR: [028]: backup and archive info files exist but do not match the database\n       HINT: is this the correct stanza?\n       HINT: did an error occur during stanza-upgrade?\n'
unit-postgresql-k8s-0: 15:22:16 INFO juju.worker.uniter.operation ran "s3-parameters-relation-changed" hook (via hook dispatching script: dispatch)
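
For reference, the pgBackRest failure above can be recognized programmatically from the stderr text. The following is only a minimal sketch, not code from the charm: the helper name `parse_pgbackrest_error` and the regular expression are assumptions made for illustration, and the sample stderr is copied from the log above.

```python
import re
from typing import Optional, Tuple

# stderr copied verbatim from the debug log in this report.
STDERR = (
    "ERROR: [028]: backup and archive info files exist but do not match the database\n"
    "       HINT: is this the correct stanza?\n"
    "       HINT: did an error occur during stanza-upgrade?\n"
)

# Assumed line shape, based only on the log above: "ERROR: [NNN]: message".
_ERROR_RE = re.compile(r"^ERROR: \[(\d+)\]: (.*)$", re.MULTILINE)


def parse_pgbackrest_error(stderr: str) -> Optional[Tuple[int, str]]:
    """Return (error_code, message) from the first ERROR line, or None.

    Hypothetical helper for illustration; it is not part of the charm.
    """
    match = _ERROR_RE.search(stderr)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_pgbackrest_error(STDERR))
```

Here the numeric code matches the exit code 28 reported by `ops.pebble.ExecError`, so either could be used to tell this "stanza already exists for a different database" case apart from a genuine S3 misconfiguration.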

When trying to restore:

unit-postgresql-k8s-0: 15:25:12 INFO unit.postgresql-k8s/0.juju-log Checking if cluster is in blocked state
unit-postgresql-k8s-0: 15:25:12 ERROR unit.postgresql-k8s/0.juju-log Restore failed: Cluster or unit is in a blocking state

Additional context

After changing the line at https://github.com/canonical/postgresql-k8s-operator/blob/f1a470ddabe8ea7acc5e7407c8753c1d2273961a/src/backups.py#L639C11-L639C11 to allow the current blocked-status message, I was able to restore the database successfully.
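
The kind of change described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the charm's actual code: `UnitStatus`, `blocks_restore`, and the `tolerated` parameter are hypothetical names, and only the status message string is taken from the `juju status` output in this report.

```python
from dataclasses import dataclass
from typing import Iterable

# Status message copied from the `juju status` output in this report.
FAILED_TO_INITIALIZE_STANZA = "failed to initialize stanza, check your S3 settings"


@dataclass(frozen=True)
class UnitStatus:
    """Hypothetical stand-in for a unit status: a name plus a message."""

    name: str  # e.g. "active" or "blocked"
    message: str = ""


def blocks_restore(
    status: UnitStatus,
    tolerated: Iterable[str] = (FAILED_TO_INITIALIZE_STANZA,),
) -> bool:
    """Return True if the status should prevent a restore.

    Assumption: a blocked unit prevents a restore unless its message is one
    that a restore is expected to resolve, such as the stanza failure above.
    """
    if status.name != "blocked":
        return False
    return status.message not in tuple(tolerated)


if __name__ == "__main__":
    print(blocks_restore(UnitStatus("blocked", FAILED_TO_INITIALIZE_STANZA)))
    print(blocks_restore(UnitStatus("blocked", "some other problem")))
```

The idea is that the stanza failure is an expected condition when attaching a fresh cluster to a bucket that already holds another cluster's backups, so it should not, on its own, stop the restore action.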

github-actions[bot] commented 10 months ago

https://warthogs.atlassian.net/browse/DPE-3061

marceloneppel commented 10 months ago

Hi @javierdelapuente! Have you reset the passwords according to the instructions at https://charmhub.io/postgresql-k8s/docs/h-migrate-cluster-via-restore?

javierdelapuente commented 9 months ago

Hi @marceloneppel, yes, the password was reset.

marceloneppel commented 8 months ago

Thanks @javierdelapuente! The fix has been published through revision 185 in the 14/edge channel.