2ndQuadrant / pglogical

Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
http://2ndquadrant.com/en/resources/pglogical/
Other
1.04k stars 154 forks source link

Can't drop extension after failed initial sync? #422

Open depesz opened 1 year ago

depesz commented 1 year ago

Hi, I was trying to setup pglogical replication from pg12 to pg14 of dataset with ~ 3tb. after two days, we gave up. dropped all stuff related to pglogical form the pg14 side, and removed everything from pg12:

SELECT s.set_name, pglogical.replication_set_remove_table( s.set_name, t.set_reloid) as status, count(*) from pglogical.replication_set s join pglogical.replication_set_table t using (set_id) group by set_name, status order by set_name, count desc;
SELECT s.set_name, pglogical.drop_replication_set( s.set_name) from pglogical.replication_set s;
SELECT pglogical.drop_node('provider');
drop extension if exists pglogical;

Which worked up to the drop extension.

Drop extension breaks, as it deadlocks with anything.

Checked all tables in pglogical schema, there were some rows in .queue table, and only this table, so i truncated this table.

But still can't drop the extension, it deadlocks with anything running on the system, and the deadlock message seems to point to one of the most commonly used tables.

Why does it deadlock? Can I do anything about it? There are no repsets, no repset tables, no nodes, nothing in any pglogical.* table.

depesz commented 1 year ago

Found that there are MANY rows in pg_depend that relate to pglogical extension:

=# select classid::regclass, count(*) from pg_depend where refclassid = 'pg_extension'::regclass and refobjid = (select oid from pg_extension where extname = 'pglogical') group by 1;
  classid   │ count
════════════╪═══════
 pg_proc    │    39
 pg_class   │    12
 pg_trigger │  2638
(3 rows)

All dependencied with pg_class classid were to tables in pglogical schema.

pg_proc - relates to pglogical.* functions.

But the pg_trigger is weird. There point to triggers that I don't see in \d output. For example:

=# \d some_schema.schema_migrations
          Table "some_schema.schema_migrations"
 Column  │          Type          │ Collation │ Nullable │ Default
═════════╪════════════════════════╪═══════════╪══════════╪═════════
 version │ character varying(255) │           │ not null │
Indexes:
    "schema_migrations_pkey" PRIMARY KEY, btree (version), tablespace "data1"
Tablespace: "data1"

=# select t.* from pg_depend d join pg_trigger t on d.objid = t.oid where d.refclassid = 'pg_extension'::regclass and d.refobjid = (select oid from pg_extension where extname = 'pglogical') and d.classid = 'pg_trigger'::regclass and t.tgrelid = 'some_schema.schema_migrations'::regclass \gx
─[ RECORD 1 ]──┬──────────────────────────────────
oid            │ 1625487005
tgrelid        │ 28469
tgname         │ queue_truncate_trigger_1625487005
tgfoid         │ 1625485293
tgtype         │ 32
tgenabled      │ O
tgisinternal   │ t
tgconstrrelid  │ 0
tgconstrindid  │ 0
tgconstraint   │ 0
tgdeferrable   │ f
tginitdeferred │ f
tgnargs        │ 0
tgattr         │
tgargs         │ \x
tgqual         │ [null]
tgoldtable     │ [null]
tgnewtable     │ [null]

pg_dump -s also doesn't show these triggers?!

depesz commented 1 year ago

OK. So, after some thinking, I did:

select format('drop trigger %I ON %s;', tgname, tgrelid::regclass::text) from pg_trigger  where tgname ~ 'queue_truncate_trig' \gexec

and then drop extension pglogical;, and it seems to work fine.

Perhaps it could be added to pglogical to remove obsolete triggers? Or maybe I missed mention in docs about it?