Open gitgithan opened 3 days ago
I propose using pg_restore -c --if-exists -U user -d postgres data.dump
--if-exists caters to the 1st run of pg_restore
From man,
-c
--clean
Before restoring database objects, issue commands to DROP all the objects that will be restored. This option is
useful for overwriting an existing database. If any of the objects do not exist in the destination database,
ignorable error messages will be reported, unless --if-exists is also specified.
From right to left, 3 pg_restore without -c --if-exists, followed by 1 pg_restore with. To show that the sizes go back to the original size instead of duplicating data.
This non-idempotent issue has caused errors that made me second guess whether the restore succeeded. With a -c during pg_restore, these errors should not appear and give users more confidence the restore works.
--------TRUNCATED--------
pg_restore: from TOC entry 3227; 2606 16435 FK CONSTRAINT game_details game_details_game_id_fkey postgres
pg_restore: error: could not execute query: ERROR: constraint "game_details_game_id_fkey" for relation "game_details" already exists
Command was: ALTER TABLE ONLY public.game_details
ADD CONSTRAINT game_details_game_id_fkey FOREIGN KEY (game_id) REFERENCES public.games(game_id);
pg_restore: warning: errors ignored on restore: 16
This idea is more relevant to local setups of postgres.
For docker which uses initialization scripts, the 1st post in this issue cites that docker only runs initialization scripts once, so adding -c --if-exists
in /docker-entrypoint-initdb.d/init-db.sh
seems pointless.
However these 2 options are still useful if users exec into container to restore again if depending on /docker-entrypoint-initdb.d/init-db.sh
fails
Extra investigations
I checked output_file.sql
from pg_restore -U user -f output_file.sql data.dump
and was surprised to see it drops all constraints and tables, so I expect the restore to be idempotent even without -c or --if-exists
. My guess is even though the commands are laid out here, what actually gets run still gets affected by flags used or missing during pg_restore, so I bet these 4 + 9 statements will not run unless -c and --if-exists is used during restore
ALTER TABLE IF EXISTS ONLY public.game_details DROP CONSTRAINT IF EXISTS game_details_game_id_fkey;
ALTER TABLE IF EXISTS ONLY public.player_seasons DROP CONSTRAINT IF EXISTS player_seasons_pkey;
ALTER TABLE IF EXISTS ONLY public.games DROP CONSTRAINT IF EXISTS games_pkey;
ALTER TABLE IF EXISTS ONLY public.actor_films DROP CONSTRAINT IF EXISTS actor_films_pkey;
DROP TABLE IF EXISTS public.users;
DROP TABLE IF EXISTS public.teams;
DROP TABLE IF EXISTS public.player_seasons;
DROP TABLE IF EXISTS public.games;
DROP TABLE IF EXISTS public.game_details;
DROP TABLE IF EXISTS public.events;
DROP TABLE IF EXISTS public.devices;
DROP TABLE IF EXISTS public.arena;
DROP TABLE IF EXISTS public.actor_films;
I noticed running
pg_restore -U user -d postgres data.dump
repeatedly causes some tables to grow in size. I expected tables to be the same no matter how many restores are done.This would be confusing users later if they don't know this can happen. Users may restore multiple times when retrying failed attempts.
The 2 images show 1 restore apart. Top section is local, bottom section is docker.
I'm guessing duplicated data is being inserted after multiple restores.
Is docker non-idempotent behaviour explained at Initialization scripts section of https://hub.docker.com/_/postgres?