spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0
4.76k stars 1.03k forks source link

how does the automated docker-postgres-backup exactly work? (bug?) #1227

Open janonym1 opened 3 years ago

janonym1 commented 3 years ago

Hi folks!

I already talked about my problem with the automated backup system that comes with this playbook in the matrix room #matrix-docker-ansible-deploy:devture.com but maybe this will help some other people in the future, when they encounter it as well.

I wanted to import an older (local) backup made by the docker-backup-system , but this is where I bumped into my first problem. The tool creates multiple, seperate files in my backup folder like this:

/matrix/postgres-backup/weekly
-rw-r--r-- 1 matrix matrix 1.7K Jul 11 02:01 matrix_appservice_irc-202127.sql.gz
-rw-r--r-- 1 matrix matrix 1.7K Jul 18 02:01 matrix_appservice_irc-202128.sql.gz
-rw-r--r-- 1 matrix matrix 1.7K Jul 25 02:00 matrix_appservice_irc-202129.sql.gz
-rw-r--r-- 2 matrix matrix 1.7K Aug  1 02:00 matrix_appservice_irc-202130.sql.gz
-rw-r--r-- 3 matrix matrix 1.7K Aug  5 02:00 matrix_appservice_irc-202131.sql.gz
-rw-r--r-- 1 matrix matrix  12K Jul 18 02:01 matrix_dimension-202128.sql.gz
-rw-r--r-- 1 matrix matrix  12K Jul 25 02:01 matrix_dimension-202129.sql.gz
-rw-r--r-- 1 matrix matrix  12K Aug  1 02:00 matrix_dimension-202130.sql.gz
-rw-r--r-- 2 matrix matrix  12K Aug  8 02:00 matrix_dimension-202131.sql.gz
-rw-r--r-- 3 matrix matrix  12K Aug 11 02:00 matrix_dimension-202132.sql.gz
-rw-r--r-- 1 matrix matrix 8.4K Jul 18 02:01 matrix_ma1sd-202128.sql.gz
-rw-r--r-- 1 matrix matrix 8.6K Jul 25 02:00 matrix_ma1sd-202129.sql.gz
-rw-r--r-- 1 matrix matrix 9.6K Aug  1 02:00 matrix_ma1sd-202130.sql.gz
-rw-r--r-- 2 matrix matrix  11K Aug  8 02:00 matrix_ma1sd-202131.sql.gz
-rw-r--r-- 3 matrix matrix  11K Aug 11 02:00 matrix_ma1sd-202132.sql.gz
-rw-r--r-- 1 matrix matrix 2.5K Jul 11 02:01 matrix_mautrix_signal-202127.sql.gz
-rw-r--r-- 1 matrix matrix 2.5K Jul 18 02:01 matrix_mautrix_signal-202128.sql.gz
-rw-r--r-- 1 matrix matrix 2.5K Jul 25 02:00 matrix_mautrix_signal-202129.sql.gz
-rw-r--r-- 2 matrix matrix 2.5K Aug  1 02:00 matrix_mautrix_signal-202130.sql.gz
-rw-r--r-- 3 matrix matrix 2.5K Aug  5 02:00 matrix_mautrix_signal-202131.sql.gz
-rw-r--r-- 3 matrix matrix 1.5M Apr 30 02:03 matrix_mautrix_telegram-202117.sql.gz
-rw-r--r-- 1 matrix matrix  777 Jul 11 02:01 matrix_reminder_bot-202127.sql.gz
-rw-r--r-- 1 matrix matrix  777 Jul 18 02:01 matrix_reminder_bot-202128.sql.gz
-rw-r--r-- 1 matrix matrix  777 Jul 25 02:00 matrix_reminder_bot-202129.sql.gz
-rw-r--r-- 2 matrix matrix  777 Aug  1 02:00 matrix_reminder_bot-202130.sql.gz
-rw-r--r-- 3 matrix matrix  777 Aug  5 02:00 matrix_reminder_bot-202131.sql.gz
-rw-r--r-- 1 matrix matrix 284M Jul 18 02:01 synapse-202128.sql.gz
-rw-r--r-- 1 matrix matrix 271M Jul 25 02:00 synapse-202129.sql.gz
-rw-r--r-- 1 matrix matrix 265M Aug  1 02:00 synapse-202130.sql.gz
-rw-r--r-- 2 matrix matrix 266M Aug  8 02:00 synapse-202131.sql.gz
-rw-r--r-- 3 matrix matrix 266M Aug 11 02:00 synapse-202132.sql.gz

So, I thought I just need import the files according to a chosen week one after another. I imported them like described on importing-postgres-page

That is where I ran into my second problem: the importing process stops (depending on which file I start with): "'ERROR: schema "public" already exists'

Slavi recommended, that I do the following to recreate a clean postgres-DB before importing:

that helped with importing the first file but as soon as import another one (eg. synapse first, then ma1sd) I still get a lot of 'ERROR: schema "public" already exists'.

The solution to this was to extract the gz-files and manually edit all the CREATE TABLE lines out, whenever an the "already exists" error came up. So finally, after manually extracting and editing the synapse, matrix_ma1sd, matrix_reminder_bot, matrix_mautrix_signal, matrix_dimension and matrix_appservice_irc I could at least import them successfully.

However, when I just ran a manual backup with:

/usr/bin/docker exec \
--env-file=/matrix/postgres/env-postgres-psql \
matrix-postgres \
/usr/local/bin/pg_dumpall -h matrix-postgres \
| gzip -c \
> /matrix/postgres-backup/manually/backup/date_today_postgres.sql.gz

I could easily import this (single) file normally and it worked as intended.

The process with multiple backup-files and the extracting and editing seem bothersome and not intended, so maybe I am overlooking something obvious?

wol-win commented 3 years ago

I was always wondering why the backup is separated out into different files. But if indeed a replay is not possible without modifying the files, it is indeed worth considering changing the playbook to do a pg_dumpall rather individual parts of the database.

janonym1 commented 3 years ago

Unfortunately, I never got around to test the backup system beforehand until I needed it of course! ;) Since the sizes of the files were about the same as my manual pg_dumpall backups, I assumed it should work as well.

I can imagine the advantages of separating the different DBs into different files, because maybe you only need the ma1sd data/users for something and not the whole thing.

I see basically 3 possibilites: -) I am doing something wrong (maybe there is an certain order to importing the backup files or I am overlooking something) -) I ran some breaking changes over the 1-2 months, which made the import over a certain timespan buggy, however I dont recall big postgreSQL updates/upgrades (I only ever used this playbook for most things) -) the automated backup system or the importing is over-redundant and buggy

For now, my backup is back to my good old pg_dumpall every few days. I was wondering if other people are having the same troubles

skepticalwaves commented 3 years ago

The container which does the backups has some documentation on how to restore: https://github.com/prodrigestivill/docker-postgres-backup-local#restore-examples

janonym1 commented 2 years ago

The container which does the backups has some documentation on how to restore: https://github.com/prodrigestivill/docker-postgres-backup-local#restore-examples

We moved on to a dedicated PostgreSQL-DB server but if I find the time, I will test and retrace the problem on my testsystem (if I can recreate it with the latest updates)

ptman commented 2 years ago

https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/configuring-playbook-postgres-backup.md would be great to have restore guide, but ultimately that can be found as pointed out above.

foxcris commented 2 years ago

I think having seperated files for each db is better then having one big file. Especially if you want to restore only a specific db. To restore just follow https://github.com/prodrigestivill/docker-postgres-backup-local#restore-examples and delete the specific database before from the running instance.