reorg / pg_repack

Reorganize tables in PostgreSQL databases with minimal locks
BSD 3-Clause "New" or "Revised" License
1.88k stars 172 forks source link

pg_repack fails with -I Partition_Table #264

Open anyasharf opened 3 years ago

anyasharf commented 3 years ago

pg_repack fails with -I Partition_Table

the ERROR of pg_repack is: ERROR: pg_repack failed with error: SSL SYSCALL error: EOF detected

It happens in RDS dbs Postgres11. and Postgres12. with pg_repack1.4.4 and pg_repakc1.4.5 respectively.

When looking into the postgres log file found the following query that actually amkes the crash of ALL entire connection to the whole instance. And the query is: SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t, (VALUES (quote_ident('$1'::text))) as v (tablespace) WHERE (relid = ANY(repack.get_table_and_inheritors('$2'::regclass))) ORDER BY t.relname, t.schemaname;

When we tried to run only this query also without pg_repack via psql and with one of our partition table(30 partitions) we get the same ERROR!.

Actually table "repack.tables" does not include any partitioned table.

dvarrazzo commented 3 years ago

Can you provide a test case to reproduce the issue?

anyasharf commented 3 years ago

-- create partitioned table with unique key

create table a2(brand_id bigint not null, i bigserial, str text not null) PARTITION BY HASH (brand_id); CREATE UNIQUE INDEX a2_idx ON a2 USING btree (brand_id, str, i);

create table a2_0 (brand_id bigint not null, i bigserial, str text not null); create table a2_1 (brand_id bigint not null, i bigserial, str text not null); create table a2_2 (brand_id bigint not null, i bigserial, str text not null); create table a2_3 (brand_id bigint not null, i bigserial, str text not null); create table a2_4 (brand_id bigint not null, i bigserial, str text not null); create table a2_5 (brand_id bigint not null, i bigserial, str text not null); create table a2_6 (brand_id bigint not null, i bigserial, str text not null); create table a2_7 (brand_id bigint not null, i bigserial, str text not null); create table a2_8 (brand_id bigint not null, i bigserial, str text not null); create table a2_9 (brand_id bigint not null, i bigserial, str text not null); create table a2_10(brand_id bigint not null, i bigserial, str text not null); create table a2_11(brand_id bigint not null, i bigserial, str text not null); create table a2_12(brand_id bigint not null, i bigserial, str text not null); create table a2_13(brand_id bigint not null, i bigserial, str text not null); create table a2_14(brand_id bigint not null, i bigserial, str text not null); create table a2_15(brand_id bigint not null, i bigserial, str text not null); create table a2_16(brand_id bigint not null, i bigserial, str text not null); create table a2_17(brand_id bigint not null, i bigserial, str text not null); create table a2_18(brand_id bigint not null, i bigserial, str text not null); create table a2_19(brand_id bigint not null, i bigserial, str text not null); create table a2_20(brand_id bigint not null, i bigserial, str text not null); create table a2_21(brand_id bigint not null, i bigserial, str text not null); create table a2_22(brand_id bigint not null, i bigserial, str text not null); create table a2_23(brand_id bigint not null, i bigserial, str text not null); create table a2_24(brand_id bigint not null, i bigserial, str text not null); create table a2_25(brand_id bigint not null, i bigserial, str text not null); create table a2_26(brand_id bigint not null, i bigserial, str text not null); create table a2_27(brand_id bigint not null, i bigserial, str text not null); create table a2_28(brand_id bigint not null, i bigserial, str text not null); create table a2_29(brand_id bigint not null, i bigserial, str text not null);

ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_0 FOR VALUES WITH (modulus 30, remainder 0); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_1 FOR VALUES WITH (modulus 30, remainder 1); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_2 FOR VALUES WITH (modulus 30, remainder 2 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_3 FOR VALUES WITH (modulus 30, remainder 3 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_4 FOR VALUES WITH (modulus 30, remainder 4 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_5 FOR VALUES WITH (modulus 30, remainder 5 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_6 FOR VALUES WITH (modulus 30, remainder 6 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_7 FOR VALUES WITH (modulus 30, remainder 7 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_8 FOR VALUES WITH (modulus 30, remainder 8 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_9 FOR VALUES WITH (modulus 30, remainder 9 ); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_10 FOR VALUES WITH (modulus 30, remainder 10); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_11 FOR VALUES WITH (modulus 30, remainder 11); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_12 FOR VALUES WITH (modulus 30, remainder 12); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_13 FOR VALUES WITH (modulus 30, remainder 13); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_14 FOR VALUES WITH (modulus 30, remainder 14); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_15 FOR VALUES WITH (modulus 30, remainder 15); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_16 FOR VALUES WITH (modulus 30, remainder 16); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_17 FOR VALUES WITH (modulus 30, remainder 17); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_18 FOR VALUES WITH (modulus 30, remainder 18); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_19 FOR VALUES WITH (modulus 30, remainder 19); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_20 FOR VALUES WITH (modulus 30, remainder 20); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_21 FOR VALUES WITH (modulus 30, remainder 21); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_22 FOR VALUES WITH (modulus 30, remainder 22); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_23 FOR VALUES WITH (modulus 30, remainder 23); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_24 FOR VALUES WITH (modulus 30, remainder 24); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_25 FOR VALUES WITH (modulus 30, remainder 25); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_26 FOR VALUES WITH (modulus 30, remainder 26); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_27 FOR VALUES WITH (modulus 30, remainder 27); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_28 FOR VALUES WITH (modulus 30, remainder 28); ALTER TABLE ONLY public.a2 ATTACH PARTITION public.a2_29 FOR VALUES WITH (modulus 30, remainder 29);

-- insert few rows

INSERT INTO a2(brand_id, str) SELECT i, md5(random()::text||random()::text||random()::text) FROM generate_series(1, 1000000) AS t(i);

-- run pg_repack pg_repack -k -h -D -U usr -d DBBB -I a2 -n -j 1 ERROR: pg_repack failed with error: SSL SYSCALL error: EOF detected

-- goto log of postgres instance SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t, (VALUES (quote_ident('$1'::text))) as v (tablespace) WHERE (relid = ANY(repack.get_table_and_inheritors('$2'::regclass))) ORDER BY t.relname, t.schemaname;

-- running this query with $1 and $2 ='a2' that hopefully correct: SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t, (VALUES (quote_ident('a2'::text))) as v (tablespace) WHERE (relid = ANY(repack.get_table_and_inheritors('a2'::regclass))) ORDER BY t.relname, t.schemaname;

and the postgres log looks like: 2021-02-23 19:39:08 UTC::@:[13827]:LOG: server process (PID 18972) was terminated by signal 11: Segmentation fault 2021-02-23 19:39:08 UTC::@:[13827]:DETAIL: Failed process was running: SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t, (VALUES (quote_ident($1::text))) as v (tablespace) WHERE (relid = ANY(repack.get_table_and_inheritors($2::regclass))) ORDER BY t.relname, t.schemaname 2021-02-23 19:39:08 UTC::@:[13827]:LOG: terminating any other active server processes 2021-02-23 19:39:08 UTC:1.1.1.9(56464):usr@DBB:[18973]:WARNING: terminating connection because of crash of another server process 2021-02-23 19:39:08 UTC:1.1.1.9(56464):usr@DBB:[18973]:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2021-02-23 19:39:08 UTC:1.1.1.9(56464):usr@DBB:[18973]:HINT: In a moment you should be able to reconnect to the database and repeat your command. 2021-02-23 19:39:08 UTC::@:[12335]:WARNING: terminating connection because of crash of another server process 2021-02-23 19:39:08 UTC::@:[12335]:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2021-02-23 19:39:08 UTC::@:[12335]:HINT: In a moment you should be able to reconnect to the database and repeat your command. 2021-02-23 19:39:08 UTC::@:[13827]:LOG: archiver process (PID 12336) exited with exit code 1 2021-02-23 19:39:08 UTC::@:[13827]:LOG: all server processes terminated; reinitializing 2021-02-23 19:39:09 UTC::@:[18974]:LOG: database system was interrupted; last known up at 2021-02-23 19:35:08 UTC 2021-02-23 19:39:09 UTC::@:[18974]:LOG: database system was not properly shut down; automatic recovery in progress


thank you Anya

anyasharf commented 3 years ago

Can you provide a test case to reproduce the issue?

thanks