BCDevOps / backup-container

A simple container for a simple backup strategy.
Apache License 2.0
42 stars 56 forks source link

Webhook notification when server fails to start in timely manner is not clear #21

Open WadeBarnes opened 5 years ago

WadeBarnes commented 5 years ago

When database validation fails due to server startup, the resulting WebHook notification is lacking the details of why the verification failed.

Example notification:

Rocket.Cat @rocket.cat Bot 6:53 AM
Error message received from Ontario's Verifiable Organizations (prod):
Backup verification failed; /backups/daily/2019-04-03/postgresql-TheOrgBook_Database_2019-04-03_09-00-00.sql.gz

Elapsed time: 0h:0m:45s - Status Code: 1

Running the verification from the command line provides the important failure details;

sh-4.2$ ./backup.sh -s -v all

Verifying backup ...

Settings:
- Database: postgresql:5432/TheOrgBook_Database
- Backup file: /backups/daily/2019-04-03/postgresql-TheOrgBook_Database_2019-04-03_09-00-00.sql.gz

waiting for server to start..............................
The server failed to start within 30 seconds.

waiting for server to shut down...... done
server stopped
Cleaning up ...

[!!ERROR!!] - Backup verification failed; /backups/daily/2019-04-03/postgresql-TheOrgBook_Database_2019-04-03_09-00-00.sql.gz

Elapsed time: 0h:0m:45s - Status Code: 1

The Webhook notification should indicate the verification failed because the server did not start in the allotted time.

The amount of time the script waits for the server to start is controlled by the DATABASE_SERVER_TIMEOUT environment variable. The default is 30 seconds. If you find you are receiving these types of errors, you can try increasing the timeout.