BCDevOps / backup-container

A simple container for a simple backup strategy.
Apache License 2.0
39 stars 56 forks source link

PostgreSQL backup verification is failing in v2.8.1 and v2.9.0 #132

Open WalterMoar opened 1 month ago

WalterMoar commented 1 month ago

When upgrading to v2.9.0 our PostgreSQL 13.13 backup verifications are failing with the error:

ERROR: role "postgres" already exists

When looking into the problem it was found that everything works fine with v2.8.0, but v2.8.1 and v2.9.0 fail in this same way.

The backups up to and including v2.8.0 have the PostgreSQL database dump at the top of the backup file, followed by the PostgreSQL database cluster dump that is:

--
-- PostgreSQL database cluster dump
--

SET default_transaction_read_only = off;

SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;

--
-- Roles
--

CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS;
CREATE ROLE "userRLX";
ALTER ROLE "userRLX" WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN NOREPLICATION NOBYPASSRLS;

--
-- PostgreSQL database cluster dump complete
--

However, in v2.8.1 and v2.9.0, the order of these two sections is reversed, with the PostgreSQL database cluster dump containing the CREATE ROLE postgres coming before the PostgreSQL database dump. It looks like the order was switched in commit https://github.com/BCDevOps/backup-container/commit/eda584f7ca63ed668e1db1082782f671eadb77ee.

The logs for the failure look like:

Restoring '/backups/daily/2024-08-14/chefs-redash-postgresql-redash_2024-08-14_12-16-23.sql.gz' to '127.0.0.1/redash' ...

DROP DATABASE "redash";
DROP DATABASE
CREATE DATABASE "redash";
CREATE DATABASE
DROP SCHEMA IF EXISTS metric_helpers CASCADE;
NOTICE: schema "metric_helpers" does not exist, skipping
DROP SCHEMA
DROP SCHEMA IF EXISTS user_management CASCADE;
NOTICE: schema "user_management" does not exist, skipping
DROP SCHEMA
GRANT ALL ON DATABASE "redash" TO "userXXX";
GRANT
SET
SET
SET
ERROR: role "postgres" already exists
Restore failed.

To complicate things, the same error does occur in v2.8.0, even though the verification is marked as a success:

Restoring '/backups/daily/2024-08-14/chefs-redash-postgresql-redash_2024-08-14_12-01-16.sql.gz' to '127.0.0.1/redash' ...
DROP DATABASE "redash";
DROP DATABASE
CREATE DATABASE "redash";
CREATE DATABASE
GRANT ALL ON DATABASE "redash" TO "userRLX";
GRANT
backup.sql
roles.sql
SET
SET
SET
ERROR: role "postgres" already exists
SET
SET
[...carries on without failing...]

So if it was continuing with the verification even if there were errors (which is bad) is it possible that commit https://github.com/BCDevOps/backup-container/commit/7e4650498c0432469e277c1b404630a57b9713a6 didn't get included in the build until v2.8.1?