CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0

Follower/secondary replica out of storage space after restore #3724

Closed: cdekkers closed this issue 11 months ago

cdekkers commented 1 year ago

Overview

When attempting to perform an in-place point-in-time recovery (PITR) with two replicas, the restore process succeeds and the leader Pod starts running successfully, while the follower consumes so much storage space that it runs out.

Storage consumption metrics

Previously, the two instances were each consuming only between 300M and 500M. The culprit appears to be the pg14_wal folder, which is consuming almost all of the 10G of allocated storage space.
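
A quick way to confirm that WAL is what is filling the volume is to check disk usage inside the replica Pod. This is only a hedged sketch: the namespace is a placeholder, and the pod name, container name, and WAL path are taken from the PGO defaults and the logs further down in this issue.

```shell
# Sketch only: <namespace> is a placeholder; the pod name, the "database"
# container, and the /pgdata/pg14/pg_wal path are assumed from the PGO
# defaults and the logs in this issue.
kubectl exec -n <namespace> jira-db-instance1-t9cz-0 -c database -- df -h /pgdata
kubectl exec -n <namespace> jira-db-instance1-t9cz-0 -c database -- du -sh /pgdata/pg14/pg_wal
```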

Environment

Steps to Reproduce

REPRO

  1. Follow the instructions for performing an in-place point-in-time recovery. In our case, we have configured our instances as follows:

```yaml
instances:
  - dataVolumeClaimSpec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: standard-csi
    name: instance1
    replicas: 2
    topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            postgres-operator.crunchydata.com/instance-set: instance1
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
```
  2. After annotating the PostgresCluster, which triggers the restore (sketched just after this list), the two running instances will terminate, and a restore Pod will launch and start the restore process.
  3. Once this process has completed, two new instances will initialize: one leader, which will successfully initialize and start running, and one follower, which will attempt to do the same.
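
For reference, the restore trigger in step 2 looks roughly like the following. This is a hedged sketch of the documented PGO v5 in-place PITR flow, not our exact manifest: the namespace is a placeholder, and the repo name and target time are assumptions (the target time is inferred from the timeline history in the logs below).

```shell
# Hedged sketch of the in-place PITR trigger; all names are placeholders.
NS=<namespace>
CLUSTER=jira-db

# 1. Configure the restore section of the PostgresCluster spec.
kubectl patch -n "$NS" postgrescluster "$CLUSTER" --type merge -p '
spec:
  backups:
    pgbackrest:
      restore:
        enabled: true
        repoName: repo1
        options:
          - --type=time
          - --target="2023-09-17 10:00:00+00"
'

# 2. Trigger the restore by setting/updating the restore annotation.
kubectl annotate -n "$NS" postgrescluster "$CLUSTER" --overwrite \
  postgres-operator.crunchydata.com/pgbackrest-restore="$(date)"
```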

EXPECTED

Both the leader and the follower will initialize and start running without any errors.

ACTUAL

The leader initializes and starts running, but the follower fails to perform pg_rewind successfully and keeps retrying.

Logs

These logs will continue indefinitely (indicated by (...)) because the process keeps restarting.

```
2023-09-18 08:58:42,969 INFO: No PostgreSQL configuration items changed, nothing to reload.
2023-09-18 08:58:43,001 WARNING: Postgresql is not running.
2023-09-18 08:58:43,001 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:58:43,010 INFO: pg_controldata: pg_control version number: 1300 Catalog version number: 202107181 Database system identifier: 7178837296953745502 Database cluster state: shut down in recovery pg_control last modified: Mon Sep 18 08:54:16 2023 Latest checkpoint location: 58A/14000028 Latest checkpoint's REDO location: 58A/14000028 Latest checkpoint's REDO WAL file: 0000009A0000058A00000014 Latest checkpoint's TimeLineID: 154 Latest checkpoint's PrevTimeLineID: 154 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:23271724 Latest checkpoint's NextOID: 1108065 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 726 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid: 0 Latest checkpoint's newestCommitTsXid: 0 Time of latest checkpoint: Mon Sep 18 08:54:14 2023 Fake LSN counter for unlogged rels: 0/3E8 Minimum recovery ending location: 58A/140000A0 Min recovery ending loc's timeline: 154 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no wal_level setting: logical wal_log_hints setting: on max_connections setting: 100 max_worker_processes setting: 8 max_wal_senders setting: 10 max_prepared_xacts setting: 0 max_locks_per_xact setting: 64 track_commit_timestamp setting: off Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float8 argument passing: by value Data page checksum version: 1 Mock authentication nonce: cd3388d7c10174c16549ffdbd6f4df37d8dcd75aebe03a32f92859f213cd19b0
2023-09-18 08:58:43,014 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:58:43,037 INFO: Local timeline=154 lsn=58A/140000A0
2023-09-18 08:58:43,065 INFO: primary_timeline=155
2023-09-18 08:58:43,067 INFO: primary: history=151 589/F40000A0 no recovery target specified
152 58A/A0 no recovery target specified
153 58A/50000A0 no recovery target specified
154 584/B400FA00 before 2023-09-17 10:00:00.003336+00
2023-09-18 08:58:43,256 INFO: running pg_rewind from jira-db-instance1-7dnb-0
2023-09-18 08:58:43,318 INFO: running pg_rewind from dbname=postgres user=_crunchyrepl host=jira-db-instance1-7dnb-0.jira-db-pods port=5432 sslmode=verify-ca sslcert=/tmp/replication/tls.crt sslkey=/tmp/replication/tls.key sslrootcert=/tmp/replication/ca.crt target_session_attrs=read-write
2023-09-18 08:58:47,338 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:58:47,338 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:58:57,338 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:58:57,338 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:07,323 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:07,323 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:17,323 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:17,324 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:17,332 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:17,333 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:27,300 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:27,301 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:37,301 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:37,301 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:37,334 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:37,334 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:47,323 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:47,323 INFO: running pg_rewind from jira-db-instance1-7dnb-0 in progress
2023-09-18 08:59:52,939 INFO: pg_rewind exit code=1
2023-09-18 08:59:52,939 INFO: stdout=
2023-09-18 08:59:52,939 INFO: stderr=pg_rewind: servers diverged at WAL location 584/B400FA00 on timeline 154
pg_rewind: rewinding from last common checkpoint at 584/AF00E130 on timeline 151
ERROR: [042]: unable to get 00000097000005870000000A: repo1: 14-1/0000009700000587/00000097000005870000000A-9e5b9b16e63c5f3e11aa9f84a9d3c2c1d7b3c7d7.gz [FileWriteError] unable to write '/pgdata/pg14/pg_wal/00000097000005870000000A'
pg_rewind: error: could not restore file "00000097000005870000000A" from archive
pg_rewind: fatal: could not read WAL record at 587/A000000
2023-09-18 08:59:52,955 ERROR: Failed to rewind from healty primary: jira-db-instance1-7dnb-0
2023-09-18 08:59:52,995 WARNING: Postgresql is not running.
2023-09-18 08:59:52,995 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:53,001 INFO: pg_controldata: pg_control version number: 1300 Catalog version number: 202107181 Database system identifier: 7178837296953745502 Database cluster state: shut down in recovery pg_control last modified: Mon Sep 18 08:54:16 2023 Latest checkpoint location: 58A/14000028 Latest checkpoint's REDO location: 58A/14000028 Latest checkpoint's REDO WAL file: 0000009A0000058A00000014 Latest checkpoint's TimeLineID: 154 Latest checkpoint's PrevTimeLineID: 154 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:23271724 Latest checkpoint's NextOID: 1108065 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 726 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid: 0 Latest checkpoint's newestCommitTsXid: 0 Time of latest checkpoint: Mon Sep 18 08:54:14 2023 Fake LSN counter for unlogged rels: 0/3E8 Minimum recovery ending location: 58A/140000A0 Min recovery ending loc's timeline: 154 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no wal_level setting: logical wal_log_hints setting: on max_connections setting: 100 max_worker_processes setting: 8 max_wal_senders setting: 10 max_prepared_xacts setting: 0 max_locks_per_xact setting: 64 track_commit_timestamp setting: off Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float8 argument passing: by value Data page checksum version: 1 Mock authentication nonce: cd3388d7c10174c16549ffdbd6f4df37d8dcd75aebe03a32f92859f213cd19b0
2023-09-18 08:59:53,002 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:53,004 INFO: starting as a secondary
2023-09-18 08:59:54.323 UTC [816] FATAL: could not write lock file "postmaster.pid": No space left on device
2023-09-18 08:59:54,336 INFO: postmaster pid=816
/tmp/postgres:5432 - no response
2023-09-18 08:59:57,314 INFO: Lock owner: jira-db-instance1-7dnb-0; I am jira-db-instance1-t9cz-0
2023-09-18 08:59:57,417 INFO: failed to start postgres
(...)
```

dsessler7 commented 1 year ago

Hey @cdekkers, sorry you're having problems. I'm wondering if you can get a few things for me:

  1. PGO logs
  2. A copy of your PostgresCluster manifest
  3. Output from `pgbackrest info` (you will need to exec into the replica pod to run the command; see the sketch below)
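
For reference, a hedged sketch of how to collect the three items above; the namespaces, the operator deployment name (`deploy/pgo`), and the pod name are placeholders or assumptions that will vary per install.

```shell
# Hedged sketch; all names below are placeholders for this particular install.
NS=<namespace>

# 1. PGO (operator) logs
kubectl logs -n <operator-namespace> deploy/pgo

# 2. The PostgresCluster manifest
kubectl get -n "$NS" postgrescluster jira-db -o yaml

# 3. pgBackRest info, executed inside the replica pod's database container
kubectl exec -n "$NS" -it jira-db-instance1-t9cz-0 -c database -- pgbackrest info
```
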
jmckulk commented 11 months ago

Since we haven't gotten an update on this issue in a while, I'm going to close it. If you still need help and can provide the requested information, feel free to re-open.