rafaelma / pgbackman

PostgreSQL backup manager
https://e-mc2.net/projects/pgbackman/
GNU General Public License v3.0

pgbackman-control.service crash after register_backup_definition #58

Open yerrysherry opened 6 years ago

yerrysherry commented 6 years ago

Hello,

I am using Ubuntu 16.04 and I changed the backup path to /export instead of /srv/pgbackman. After this change, registering a new pgsql node (register_backup_definition) made the pgbackman-control.service daemon crash.

Looking at the log file:

root@pgbackman:/etc/cron.d# tail -15 /var/log/pgbackman/pgbackman.log
2017-12-17 20:49:39,079 [pgbackman_control][18935][INFO]: UID: 999 and GID: 999 defined for the directory /export/pgsql_node_1/dump
2017-12-17 20:49:39,079 [pgbackman_control][18935][INFO]: UID: 999 abd GID: 999 defined for the directory /export/pgsql_node_1/log
2017-12-17 20:49:39,094 [pgbackman_control][18935][INFO]: Cache file: /export/cache_dir/pgsql_node_2.cache created/updated
2017-12-17 20:49:39,100 [pgbackman_control][18935][WARNING]: Dump directory /srv/pgbackman/pgsql_node_2/dump does not exist
2017-12-17 20:49:39,100 [pgbackman_control][18935][CRITICAL]: OS error when creating the dump directory - [Errno 13] Permission denied: '/srv/pgbackman/pgsql_node_2'
2017-12-17 20:50:01,380 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][18974][INFO]: pgbackman_dump started.
2017-12-17 20:50:01,384 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][18974][INFO]: Backup server: pgbackman.yerry.be is registered in pgbackman
2017-12-17 20:50:01,413 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][18974][CRITICAL]: Database dump file could not be created. Return code = 127. Check log file: /export/pgsql_node_1/log/gerrit_9_3-pg1.yerry.be-v9_3-defid1-cFULL-20171217T205001-DATABASE.log
2017-12-17 20:50:01,421 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][18974][INFO]: Backup job catalog for DefID: 1 or SnapshotID: None updated in the database
2017-12-17 20:53:35,166 [pgbackman_alerts][18014][INFO]: AlertID [1] for BckID [1] registered as sent with smtp_alerts=OFF
2017-12-17 20:55:01,482 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][19093][INFO]: pgbackman_dump started.
2017-12-17 20:55:01,487 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][19093][INFO]: Backup server: pgbackman.yerry.be is registered in pgbackman
2017-12-17 20:55:01,517 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][19093][CRITICAL]: Database dump file could not be created. Return code = 127. Check log file: /export/pgsql_node_1/log/gerrit_9_3-pg1.yerry.be-v9_3-defid1-cFULL-20171217T205501-DATABASE.log
2017-12-17 20:55:01,527 [pgbackman_dump][pg1.yerry.be][gerrit_9_3][19093][INFO]: Backup job catalog for DefID: 1 or SnapshotID: None updated in the database
2017-12-17 20:58:35,284 [pgbackman_alerts][18014][INFO]: AlertID [2] for BckID [2] registered as sent with smtp_alerts=OFF
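The CRITICAL entry from pgbackman_control in the log shows the daemon stopping on an OS error while creating a dump directory. As an illustration only, here is a minimal Python sketch of creating a directory so that a permission error is logged and reported to the caller instead of ending the process. The function name `ensure_directory` and the logger name are hypothetical and are not taken from the pgbackman source.

```python
import logging
import os

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("example_control")


def ensure_directory(path: str) -> bool:
    """Create path (and missing parents); return False instead of raising on OS errors."""
    try:
        os.makedirs(path, exist_ok=True)
    except OSError as exc:
        # OSError covers PermissionError (Errno 13) and other filesystem failures.
        log.critical("OS error when creating directory %s - %s", path, exc)
        return False
    return True
```

A caller could skip the affected node and carry on with the others when the function returns False, rather than letting the exception end the whole service.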


[pgbackman]$ show_pgsql_node_config 2

NodeID / FQDN: 2


+------------------------------+-----------------------------+-----------------------------------------------------------+
| Parameter                    | Value                       | Description                                               |
+------------------------------+-----------------------------+-----------------------------------------------------------+
| admin_user                   | postgres                    | postgreSQL admin user                                     |
| automatic_deletion_retention | 14 days                     | Retention after automatic deletion of a backup definition |
| backup_code                  | FULL                        | Backup job code                                           |
| backup_day_month_cron        |                             | Backup day_month cron default                             |
| backup_hours_interval        | 01-06                       | Backup hours interval                                     |
| backup_job_status            | ACTIVE                      | Backup job status                                         |
| backup_minutes_interval      | 01-59                       | Backup minutes interval                                   |
| backup_month_cron            |                             | Backup month cron default                                 |
| backup_weekday_cron          |                             | Backup weekday cron default                               |
| domain                       | yerry.be                    | Default domain                                            |
| encryption                   | false                       | GnuPG encryption - Not used*                              |
| extra_backup_parameters      |                             | Extra backup parameters                                   |
| extra_restore_parameters     |                             | Extra restore parameters                                  |
| logs_email                   | example@example.org         | E-mail to send logs                                       |
| pgnode_backup_partition      | /srv/pgbackman/pgsql_node_2 | Partition to save pgbackman information for a pgnode      |
| pgnode_crontab_file          | /etc/cron.d/pgsql_node_2    | Crontab file for pgnode in the backup server              |
| pgport                       | 5432                        | postgreSQL port                                           |
| pgsql_node_status            | STOPPED                     | pgsql node status                                         |
| retention_period             | 7 days                      | Retention period for a backup job                         |
| retention_redundancy         | 1                           | Retention redundancy for a backup job                     |
+------------------------------+-----------------------------+-----------------------------------------------------------+

The problem is pgnode_backup_partition. For the new node, this value still points to the default folder (/srv/pgbackman) instead of /export.
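A mismatch like this can be detected with a simple path check. The sketch below is a hedged illustration, not part of pgbackman: the function names and the dictionary of node partitions are hypothetical, and the values are only the paths that appear in this report.

```python
import os


def is_under_root(root_backup_dir: str, node_partition: str) -> bool:
    """Return True if node_partition is root_backup_dir or lies below it."""
    root = os.path.normpath(root_backup_dir)
    part = os.path.normpath(node_partition)
    return os.path.commonpath([root, part]) == root


def stale_nodes(root_backup_dir: str, partitions: dict) -> list:
    """Return the node IDs whose partition is outside root_backup_dir."""
    return [
        node_id
        for node_id, partition in sorted(partitions.items())
        if not is_under_root(root_backup_dir, partition)
    ]


# Paths taken from the report; the dict layout itself is an assumption.
partitions = {
    1: "/export/pgsql_node_1",
    2: "/srv/pgbackman/pgsql_node_2",
}
print(stale_nodes("/export", partitions))
```

With the two partitions from this report, only node 2 is flagged, which matches the node the daemon failed on.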


I can change this parameter with: [pgbackman]$ update_pgsql_node_config

NodeID / FQDN []: 2

...

Backup directory [/srv/pgbackman/pgsql_node_2]: /export/pgsql_node_2

Crontab file [/etc/cron.d/pgsql_node_2]:

PgSQL node status [STOPPED]:

...


After that, I started the daemon again and everything works.

systemctl start pgbackman-control.service
root@pgbackman:/etc/cron.d# systemctl status pgbackman-control.service
● pgbackman-control.service - pgbackman control service
   Loaded: loaded (/lib/systemd/system/pgbackman-control.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-12-17 21:03:29 UTC; 6s ago
 Main PID: 19278 (pgbackman_contr)
    Tasks: 1
   Memory: 8.6M
      CPU: 60ms
   CGroup: /system.slice/pgbackman-control.service
           └─19278 /usr/bin/python /usr/bin/pgbackman_control

So, I think there are 2 problems

Regards, Gerrit Belgium

rafaelma commented 6 years ago

Hello, thanks for the feedback.

The table backup_server_default_config also has an attribute to define the root backup dir. I have not tested this, but if you also change the root_backup_dir value of your backup server with the command update_backup_server_config, your update should work.

Anyway, I can see that the implementation we have now is not optimal. The system should automatically apply a new backup partition value everywhere it is used.

The question is what to do with all the backups already taken and registered in the catalog if you change the backup partition after the system has been using another one. These backups have information about the partition/directory where the backups and logs are saved.
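One possible answer to that question is to rewrite the stored paths from the old root to the new one. The following Python sketch only illustrates that option. It is not the procedure pgbackman uses, the function name is hypothetical, and the file name in the usage lines is made up for the example.

```python
from pathlib import PurePosixPath


def remap_backup_path(path: str, old_root: str, new_root: str) -> str:
    """Move path from old_root to new_root; leave unrelated paths unchanged."""
    try:
        relative = PurePosixPath(path).relative_to(old_root)
    except ValueError:
        # The path is not below old_root, so there is nothing to remap.
        return path
    return str(PurePosixPath(new_root) / relative)


# Hypothetical catalog entry recorded under the old root.
old_entry = "/srv/pgbackman/pgsql_node_2/dump/example.sql"
print(remap_backup_path(old_entry, "/srv/pgbackman", "/export"))
```

Rewriting the catalog only makes sense if the files themselves are moved to the new partition as well. Keeping the old paths for backups that stay where they are would be the other option.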

I will look into this issue and document the procedure for changing the backup partition, both in a clean installation and in an installation that has been in production.

Regards, Rafael Martinez Guerrero