zalando / spilo

Highly available elephant herd: HA PostgreSQL cluster using Docker
Apache License 2.0
1.51k stars 372 forks source link

Upgrade fails if postgres install user is not 'postgres' #602

Closed imcitius closed 3 years ago

imcitius commented 3 years ago

Hello

I'm trying to upgrade cluster previously imported from old baremetal servers, where it was run under patroni and created with 'superuser' as install user. During migration to k8s postgres user was created with SUPERUSER priveleges for compatiliblity. But it is not uid 10.

When manually upgrade cluster I can specify pg_upgrade -U superuser, to ensure pg_upgrade init new db exacly as old one. But spilo's /scripts/inplace_upgrade.py script does not do this trick. Is there a way to upgrade cluster created with non-standard install user?

postgres@siam-pgsql-aux-1:/home/postgres$ python3 /scripts/inplace_upgrade.py 3
2021-07-25 21:10:07,307 inplace_upgrade INFO: No PostgreSQL configuration items changed, nothing to reload.
2021-07-25 21:10:07,378 inplace_upgrade WARNING: Kubernetes RBAC doesn't allow GET access to the 'kubernetes' endpoint in the 'default' namespace. Disabling 'bypass_api_service'.
2021-07-25 21:10:07,385 inplace_upgrade INFO: establishing a new patroni connection to the postgres cluster
2021-07-25 21:10:07,583 inplace_upgrade INFO: Cluster siam-pgsql-aux is ready to be upgraded
2021-07-25 21:10:07,585 inplace_upgrade INFO: initdb config: [{'locale': 'en_US.UTF-8'}, {'encoding': 'UTF8'}]
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /home/postgres/pgdata/pgroot/data_new ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/13/bin/pg_ctl -D /home/postgres/pgdata/pgroot/data_new -l logfile start

2021-07-25 21:10:08,875 inplace_upgrade INFO: Dropping extensions from the cluster which could be incompatible
2021-07-25 21:10:08,883 inplace_upgrade INFO: Executing "DROP EXTENSION IF EXISTS amcheck_next" in the database="template1"
2021-07-25 21:10:08,891 inplace_upgrade INFO: Executing "DROP EXTENSION IF EXISTS amcheck_next" in the database="api"
2021-07-25 21:10:08,930 inplace_upgrade INFO: Executing "DROP EXTENSION IF EXISTS amcheck_next" in the database="postgres"
2021-07-25 21:10:08,940 inplace_upgrade INFO: Executing pg_upgrade --check
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
Checking database user is the install user
database user "postgres" is not the install user
Failure, exiting
2021-07-25 21:10:09,359 inplace_upgrade ERROR: pg_upgrade --check failed, more details in the /home/postgres/pgdata/pgroot/data_upgrade
CyberDem0n commented 3 years ago

Right. The default bootstrap superuser in Spilo is postgres and we never thought about supporintg scenarios when it has a different name. The pg_upgrade requires that the username that initializing the cluster is the same between the old and new versions, which apparently doesn't match with the postgres. The bootstrap superuser is always created with Oid = 10.

The simplest thing you can do:

  1. find the bootstrap superuser (SELECT rolname FROM pg_roles WHERE oid = 10). Lets assume that its name is patroni.
  2. rename the patroni username to postgres:
    ALTER ROLE patroni WITH LOGIN SUPERUSER;
    ALTER ROLE postgres RENAME TO old_superuser;
    ALTER ROLE patroni RENAME TO postgres;
    ALTER ROLE postgres WITH PASSWORD 'theVerySecretPassword'; /* or use the `\password postgres` in psql */

In order to successfully execute the above commands, you may need to create an intermediate superuser (renaming the current session user isn't possible).

After that major upgrade will work.

imcitius commented 3 years ago

Hello, Alexander, thanks for your prompt reply.

I renamed the install user, but I now faced another problem: pg_upgrade cannot convert some objects during upgrade. May be because of its relation to old_superuser account?

command: "/usr/lib/postgresql/13/bin/pg_dump" --host /home/postgres/pgdata/pgroot/data_upgrade --port 50432 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom  --file="pg_upgrade_dump_13806.custom" 'dbname=postgres' >> "pg_upgrade_dump_13806.log" 2>&1

command: "/usr/lib/postgresql/13/bin/pg_restore" --host /home/postgres/pgdata/pgroot/data_upgrade --port 50432 --username postgres --clean --create --exit-on-error --verbose --dbname template1 "pg_upgrade_dump_13806.custom" >> "pg_upgrade_dump_13806.log" 2>&1
pg_restore: connecting to database for restore
pg_restore: dropping DATABASE PROPERTIES postgres
pg_restore: dropping DATABASE postgres
pg_restore: creating DATABASE "postgres"
pg_restore: connecting to new database "postgres"
pg_restore: creating COMMENT "DATABASE "postgres""
pg_restore: creating DATABASE PROPERTIES "postgres"
pg_restore: connecting to new database "postgres"
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating SCHEMA "metric_helpers"
pg_restore: creating SCHEMA "prometheus"
pg_restore: creating SCHEMA "user_management"
pg_restore: creating SCHEMA "zmon_utils"
pg_restore: creating PROCEDURAL LANGUAGE "plperlu"
pg_restore: creating EXTENSION "file_fdw"
pg_restore: creating COMMENT "EXTENSION "file_fdw""
pg_restore: creating EXTENSION "pg_auth_mon"
pg_restore: creating COMMENT "EXTENSION "pg_auth_mon""
pg_restore: creating EXTENSION "set_user"
pg_restore: creating COMMENT "EXTENSION "set_user""
pg_restore: creating TYPE "zmon_utils.system_information"
pg_restore: creating FUNCTION "metric_helpers.get_btree_bloat_approx()"
pg_restore: creating FUNCTION "metric_helpers.get_table_bloat_approx()"
pg_restore: creating FUNCTION "public.file_fdw_handler()"
pg_restore: creating FUNCTION "public.file_fdw_validator("text"[], "oid")"
pg_restore: creating FUNCTION "public.get_pg_stat_activity()"
pg_restore: creating FUNCTION "public.get_pg_stat_replication()"
pg_restore: creating FUNCTION "public.pg_auth_mon()"
pg_restore: creating FUNCTION "public.reset_user()"
pg_restore: creating FUNCTION "public.reset_user("text")"
pg_restore: creating FUNCTION "public.set_user("text")"
pg_restore: creating FUNCTION "public.set_user("text", "text")"
pg_restore: creating FUNCTION "public.set_user_u("text")"
pg_restore: creating FUNCTION "user_management.random_password(integer)"
pg_restore: creating FUNCTION "user_management.terminate_backend(integer)"
pg_restore: creating COMMENT "user_management.FUNCTION "terminate_backend"("pid" integer)"
pg_restore: creating FOREIGN DATA WRAPPER "file_fdw"
pg_restore: creating VIEW "metric_helpers.index_bloat"
pg_restore: creating VIEW "metric_helpers.table_bloat"
pg_restore: creating VIEW "prometheus.pg_stat_activity"
pg_restore: while PROCESSING TOC:

pg_restore: from TOC entry 206; 1259 308989 VIEW pg_stat_activity old_superuser
pg_restore: error: could not execute query: ERROR:  column reference "backend_type" is ambiguous
LINE 33:     "get_pg_stat_activity"."backend_type"
             ^
Command was:
-- For binary upgrade, must preserve pg_type oid
SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('308991'::pg_catalog.oid);

-- For binary upgrade, must preserve pg_type array oid
SELECT pg_catalog.binary_upgrade_set_next_array_pg_type_oid('308990'::pg_catalog.oid);

-- For binary upgrade, must preserve pg_class oids
SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('308989'::pg_catalog.oid);

CREATE VIEW "prometheus"."pg_stat_activity" AS
 SELECT "get_pg_stat_activity"."datid",
    "get_pg_stat_activity"."datname",
    "get_pg_stat_activity"."pid",
    "get_pg_stat_activity"."usesysid",
    "get_pg_stat_activity"."usename",
    "get_pg_stat_activity"."application_name",
    "get_pg_stat_activity"."client_addr",
    "get_pg_stat_activity"."client_hostname",
    "get_pg_stat_activity"."client_port",
    "get_pg_stat_activity"."backend_start",
    "get_pg_stat_activity"."xact_start",
    "get_pg_stat_activity"."query_start",
    "get_pg_stat_activity"."state_change",
    "get_pg_stat_activity"."wait_event_type",
    "get_pg_stat_activity"."wait_event",
    "get_pg_stat_activity"."state",
    "get_pg_stat_activity"."backend_xid",
    "get_pg_stat_activity"."backend_xmin",
    "get_pg_stat_activity"."query",
    "get_pg_stat_activity"."backend_type"
   FROM "public"."get_pg_stat_activity"() "get_pg_stat_activity"("datid", "datname", "pid", "usesysid", "usename", "application_name", "client_addr", "client_hostname", "client_port", "backend_start", "xact_start", "query_start", "state_change", "wait_event_type", "wait_event", "state", "backend_xid", "backend_xmin", "query", "backend_type");
imcitius commented 3 years ago

Huh, I found the solution myself, DROP EXTENSION IF EXISTS pg_stat_statements CASCADE and DROP VIEW prometheus.pg_stat_activity before running upgrade.

Thanks.

CyberDem0n commented 3 years ago

There are ways to break pg_upgrade that are barely detectable or automatically solvable. Creating a function that returns a type from pg_catalog that is major version dependant is one of them. The only way to resolve it - drop such objects before a major upgrade.