I also find it quite odd that I have to enable backups
Any solutions found?
I mangled the CRD definition and set minIndex: 0 in the relevant section :D
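In practice I'd guess that means something along these lines (purely a sketch of the mechanics; the exact location of the validation in the schema differs between PGO versions):

kubectl edit crd postgresclusters.postgres-operator.crunchydata.com
# ...then relax the OpenAPI validation so that a spec without
# (or with an empty) backups/pgbackrest repos list is accepted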
@laurivosandi
What did you do exactly? Hacking the PGO sources?
@cbandy
Maybe this would be an enhancement request?
JFYI: after hacking the CRDs, it looks like the WAL keeps growing forever :D
So yeah, the authors should address this issue themselves.
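You can watch the WAL piling up from inside the database container, e.g. (pod name and data path borrowed from the commands further down this thread):

kubectl exec hippo-data-ljc7-0 -c database -- du -sh /pgdata/pg14/pg_wal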
Just dropping a line here to note that we have a ticket in our backlog for disabling backups/archiving (for dev envs only, please!)
(This is a feature that has been requested in issues a few times; see also https://github.com/CrunchyData/postgres-operator/issues/2531 and https://github.com/CrunchyData/postgres-operator/issues/3152. I'm adding these links to help consolidate this issue in case I close any of those tickets.)
What is the timeline for this feature? Do you have an estimate on when this will be available?
Any update on this?
I've found a way to do this temporarily:
kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'off'
pg_ctl restart
It's very useful when my initial datasource is a big pg_dump file.
After turning archive_mode back on, kicking off a full backup seems like a good idea:
kubectl create job --from=cronjob/hippo-repo1-full backup-post-import
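The three manual steps above can also be scripted non-interactively, assuming the line in postgresql.conf appears exactly as archive_mode = 'on' and that PGDATA is set in the container environment (a sketch, not tested against every PGO version):

kubectl exec hippo-data-ljc7-0 -c database -- \
  sed -i "s/archive_mode = 'on'/archive_mode = 'off'/" /pgdata/pg14/postgresql.conf
kubectl exec hippo-data-ljc7-0 -c database -- pg_ctl restart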
Just move on to https://cloudnative-pg.io/
@bendilley
I also tried this, but it did not work for me: PGO quickly noticed the change and restored the original state with archive_mode='on'. Maybe this could be deferred with the new "pause" flag of PGO 5.2.
In any case, tricks applied outside of PGO or Kubernetes are not really useful.
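For reference, the pause flag mentioned above is presumably the spec.paused field (an assumption on my part that this is what's meant), which suspends PGO's reconciliation of changes to the cluster spec:

spec:
  paused: true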
@mzwettler2 this was with PGO 5.3.0 but I didn't explicitly pause anything.
I do think it's surprisingly difficult to initialise a cluster from a non-PGO datasource. I've even tried restoring from a pgbackrest backup in S3, but because the original source wasn't PGO-managed I still couldn't get it to work as a dataSource. pg_restore does work, but a large dataset will fill the disk with transaction logs way faster than they can be streamed by pgbackrest, at least to S3.
@laurivosandi thanks for the link - I'm open to alternatives!
Now that I think about it, I did also merge the patch below to the Patroni dynamicConfiguration in an earlier experiment. It didn't appear to have any effect at the time, but it was still in place when I manually turned off archive_mode, which might explain why I got away with it 🤷🏻
spec:
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_mode: 'off'
@bendilley
At the top of this thread you can see that archive_mode: on is hardcoded within PGO, so it was probably just a coincidence that this worked for you.
Perhaps I should ask this in a different thread, but how are you supposed to initialise a PGO database from a non-PGO datasource??
@bendilley
Yes, somewhat off topic here. Anyway, I would go with pg_dump/pg_restore. Seems to be the cleanest way for me.
pg_restore works fine for smaller datasets, but migrating a large database will fill up the disk with WAL faster than it can be backed up, at least in my experience.
@bendilley
Yep. The only "stable" dirty hack I found to prevent a filesystem-full error has been soft-linking a subdirectory of /pgbackrest/repo1/archive/db on the repo pod to /dev/null. PostgreSQL keeps generating archives, but they go straight to the trash, and PGO neither recognizes nor undoes this filesystem "change".
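Spelled out, that hack looks roughly like this (names are illustrative only: the repo host pod name and the "14-1" archive-id directory will differ per cluster):

kubectl exec hippo-repo-host-0 -c pgbackrest -it -- bash
# inside the repo pod; move the real directory aside and point the path at /dev/null
mv /pgbackrest/repo1/archive/db/14-1 /pgbackrest/repo1/archive/db/14-1.orig
ln -s /dev/null /pgbackrest/repo1/archive/db/14-1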
Any news regarding this issue?
@benjaminjb
We are hoping and praying for this feature. When will it come?
Further to my earlier observation, I've found this combination of steps to be consistently effective:
kubectl patch postgrescluster hippo --type merge --patch-file patch-patroni-archive-mode.yml
kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'off'
pg_ctl restart
exit
where patch-patroni-archive-mode.yml is:
spec:
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_mode: 'off'
I've only ever done this while pg_restore-ing a dump file. Presumably the setting wouldn't persist if the container was restarted.
After the restore is complete, I just reverse the changes and kick off a full backup:
kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'on'
pg_ctl restart
exit
kubectl patch postgrescluster hippo --type json -p '[{"op": "remove", "path": "/spec/patroni/dynamicConfiguration/postgresql/parameters/archive_mode"}]'
kubectl create job --from=cronjob/hippo-repo1-full backup-post-import
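Before trusting the cluster again, it's worth confirming that the setting took and that the backup job is running (a quick check using the same names as above):

kubectl exec hippo-data-ljc7-0 -c database -- psql -Atc 'SHOW archive_mode;'
kubectl logs job/backup-post-import --follow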
Just dropping a quick update on this thread to note that we are currently looking at some upcoming plans and changes to CPK to allow additional flexibility around disabling archiving (as requested in this issue).
In short, this issue is definitely on our radar, and we will be in touch with additional information, details, etc. once available.
I'm pleased to announce that support for optional backups has been released with Crunchy Postgres for Kubernetes v5.7.0. Please see the docs for more info. If you get a chance to try it out, please let us know what you think!
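From my reading of the 5.7.0 release notes (so treat this as an assumption and check the docs), a backup-less cluster is simply a PostgresCluster with no spec.backups section, and removing an existing backups section additionally requires an opt-in annotation:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
  annotations:
    # required when removing an existing backups section, if I read the docs right
    postgres-operator.crunchydata.com/authorizeBackupRemoval: "true"
spec:
  postgresVersion: 16
  instances:
    - name: data
      dataVolumeClaimSpec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  # note: no spec.backups section at all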
@dsessler7
Thanks for the feature. I will try it out when I find enough time.
I quickly skimmed the documentation. I would have preferred a simple spec.backups.status switch with three-way logic:
on: archives + backups (default)
off: no archives, no backups
delete: no archives, no backups; delete all existing backup artifacts
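Something like this, purely hypothetical (status is not a real field in the CRD, it's just what I'd have liked):

spec:
  backups:
    status: "off"   # or: "on" / "delete"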
I found that PGO makes archiving mandatory:
This also requires having a repo retention policy and at least one scheduled backup to get automatic retention management working.
There might be cases where I do not want archiving, e.g. when loading a large amount of data. There might even be cases where I want neither archiving nor backups, e.g. some kind of temporary test system.
QUESTION: Is there any way to disable archiving?