Open vfrans opened 3 years ago
Increasing wal_keep_segments is the usual fix for "requested WAL segment ... has already been removed" errors: https://stackoverflow.com/questions/47645487/postgres-streaming-replication-error-requested-wal-segment-has-already-been-rem But I don't know the root cause in your restore case or what should be done there.
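For reference, a minimal sketch of that mitigation in postgresql.conf (the value 64 is an arbitrary example; size it to how far your replicas can fall behind under load):

```ini
# postgresql.conf -- keep more completed WAL segments on the primary so a
# lagging replica can still stream them (each segment is 16 MB by default)
wal_keep_segments = 64
```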
Is it possible to mount a shared filesystem and configure wal-e/wal-g to use that filesystem instead of S3?
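I haven't tried this under Spilo, but standalone WAL-G does support a local or mounted filesystem as its storage backend via the WALG_FILE_PREFIX environment variable instead of the S3 settings. A sketch, where /mnt/wal-archive is a hypothetical mount point for the shared volume:

```shell
# Point WAL-G at a mounted shared filesystem instead of S3.
# /mnt/wal-archive is an example path, not a real default.
export WALG_FILE_PREFIX=/mnt/wal-archive

# Archiving and recovery then use the usual WAL-G commands, e.g.:
# archive_command = 'wal-g wal-push %p'
# restore_command = 'wal-g wal-fetch %f %p'
```

Whether the operator/Spilo image exposes a way to set that variable is a separate question.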
Hello,
We are running postgres-operator v1.50, deploying postgres image: spilo-12:1.6-p5.
Here is the yaml we use to deploy our cluster (edited for privacy reasons):
We use the following configmap to configure our operator:
When we test database failover, recovery is fast and reliable as long as the database is not under heavy load.
However, when the database is loaded, we observe the following failure while the secondary resyncs with the master:
The problem is pretty clear, the needed WAL has been recycled by the master and cannot be found.
As you might have noticed, we are using a custom archive_command to archive our WALs: we don't have access to S3, and we weren't able to configure WAL-E to archive WALs to a filesystem instead of S3. Is there any documentation to support this use case?
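For anyone hitting the same limitation, a plain-filesystem archive_command typically looks like the following (the path is illustrative, not our real archive location):

```ini
# postgresql.conf -- copy each completed WAL file to a mounted archive
# directory; 'test ! -f' avoids overwriting an already-archived segment
archive_command = 'test ! -f /mnt/wal-archive/%f && cp %p /mnt/wal-archive/%f'
```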
Now we also need to configure the system so that the restore_command generated during recovery fetches our WALs from our archive location (and not only from the master).
Is this feasible? We tried to configure the restore command this way:
But for some reason that doesn't change anything in the generated restore_command.
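What we are aiming for is a generated restore_command that mirrors the archive side, roughly like this (again, the path is illustrative):

```ini
# recovery configuration -- fetch archived WAL segments from the shared
# mount when the replica cannot stream them from the master anymore
restore_command = 'cp /mnt/wal-archive/%f %p'
```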
I realise this might be a question for Spilo/Patroni, but since we are using the operator and don't know how to pass parameters through to them, I ended up posting it here.
Thanks a lot in advance for any insight.
Best regards,
François