Open arajkumar opened 1 month ago
To add: when CREATE MATERIALIZED VIEW ... is executed, pgcopydb adds the following lines to the .sql WAL file (generated from the JSON file during transformation). These lines should be ignored:
BEGIN; -- {"xid":26398752,"lsn":"7B/290E4478","timestamp":"2024-05-27 06:58:35.824233+0000","commit_lsn":"7B/29103F18"}
PREPARE bcb53845 AS INSERT INTO "public"."truck_data_id_0" ("time", "vehicle_speed") overriding system value VALUES ($1, $2), ($3, $4), ...
EXECUTE bcb53845["2012-01-04 11:20:00+00",null,"2012-01-03 18:40:00+00",null,"2012-01-03 02:00:00+00",null,"2012-01-02 09:20:00+00",null,"2012-01-01 16:40:00+00",null,...
COMMIT; -- {"xid":26398752,"lsn":"7B/29103F18","timestamp":"2024-05-27 06:58:35.824233+0000"}
PostgreSQL logical decoding does not support DDL. This is documented both in PostgreSQL and in pgcopydb. CREATE and DROP keywords introduce DDL commands.
@dimitri, sorry if I wasn't clear: the problem is not the propagation of DDL to the target. Dropping and recreating a matview produces INSERT DML messages on the matview, and those messages cause the failure during CDC (follow).
Maybe this is a bug in Postgres, but I believe we could address it in pgcopydb.
It seems to me that using the pgoutput plugin with the PUBLICATION object, where we attach tables individually, will help solve that problem. Maybe we will just skip adding matviews in the publication.
Now, with that being said, we could skip DML that targets a matview entirely in our part of the code, as we know for sure we can't replay it. This is another case where we need to make a decision depending on the schema on the target.
I managed to reproduce with REFRESH MATERIALIZED VIEW CONCURRENTLY, without doing any DDL on the matview.
CREATE TABLE IF NOT EXISTS "metrics"(
    id BIGINT,
    "time" timestamp with time zone NOT NULL,
    name TEXT NOT NULL,
    value NUMERIC NOT NULL
);
insert into metrics values (21, now() + '4 days'::interval, 'hello', 1), (20, now() + '4 days'::interval, 'hello', 1);
create materialized view if not exists metrics_count AS SELECT id, count(*), min(value), max(value), avg(value) from metrics group by 1;
create unique index metrics_count_uniq on metrics_count using btree(id);
insert into metrics values (21, now() + '3 days'::interval, 'hello', 1), (20, now() + '5 days'::interval, 'hello', 1);
refresh materialized view concurrently metrics_count ;
$ pg_recvlogical -d "$PGCOPYDB_SOURCE_PGURI" --slot=pgcopydb --start --plugin=test_decoding --file=- --create-slot
BEGIN 86005087
table public.metrics_count: DELETE: (no-tuple-data)
table public.metrics_count: INSERT: id[bigint]:21 count[bigint]:2 min[numeric]:1 max[numeric]:1 avg[numeric]:1.00000000000000000000
COMMIT 86005087
@dimitri Do you think the above scenario is a valid use case that needs to be addressed in pgcopydb?
Yes it is, the first one too: both are DDLs. We can't support CREATE/DROP replication, but in hindsight it's fair that we could choose to ignore DML that targets a MATVIEW on the target, maybe with a WARNING message (per transaction).
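To illustrate the "ignore matview DML with one WARNING per transaction" idea, here is a minimal sketch in Python (not pgcopydb's actual C code). It works on test_decoding text lines like the ones in the pg_recvlogical output above. The function name `filter_matview_dml` and the `matviews` set are hypothetical; the set is assumed to hold schema-qualified matview names already looked up from the catalog, and relation names are assumed not to need quoting.

```python
import logging
import re

log = logging.getLogger("matview_filter")

# Matches test_decoding change lines such as:
#   table public.metrics_count: INSERT: id[bigint]:21 ...
CHANGE_RE = re.compile(r"^table (?P<rel>[^:]+): (?P<op>INSERT|UPDATE|DELETE):")


def filter_matview_dml(lines, matviews):
    """Drop changes that target a relation listed in `matviews`.

    `matviews` is a hypothetical set of schema-qualified names.
    Emits at most one warning per transaction that had changes skipped.
    """
    kept = []
    xid = None
    skipped = 0
    for line in lines:
        if line.startswith("BEGIN"):
            parts = line.split()
            xid = parts[1] if len(parts) > 1 else None
            skipped = 0
        match = CHANGE_RE.match(line)
        if match and match.group("rel") in matviews:
            skipped += 1
            continue  # cannot be replayed on the target
        if line.startswith("COMMIT") and skipped:
            log.warning("xid %s: skipped %d change(s) on materialized views",
                        xid, skipped)
        kept.append(line)
    return kept
```

Note that a transaction touching only a matview still leaves an empty BEGIN/COMMIT pair in the output; whether to drop those as well is a separate decision.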
Steps to reproduce the problem
Create matview on source
Launch pgcopydb with follow
Recreate matview (drop/create)
pgcopydb apply exits with following error message,
I tried with pg_recvlogical and found that both logical decoding plugins (wal2json, test_decoding) respond to matview recreation.
IIUC, we should filter out DML messages if the relation is not a table.
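The "is this relation a table?" check could be driven by pg_class.relkind, where 'r' is an ordinary table, 'p' a partitioned table, and 'm' a materialized view. The sketch below is a Python illustration only. The helper names are hypothetical, and the catalog rows are hard-coded here, whereas in practice they would come from a query joining pg_class and pg_namespace. Treating 'p' as replayable is an assumption; decoded changes normally name the leaf partitions, which have relkind 'r'.

```python
# pg_class.relkind values treated as replayable tables in this sketch.
# 'r' = ordinary table, 'p' = partitioned table (assumption, see above).
# 'm' (materialized view) is deliberately excluded.
TABLE_RELKINDS = {"r", "p"}


def replayable_relations(catalog_rows):
    """Build the set of schema-qualified names whose relkind is a table.

    `catalog_rows` is an iterable of (nspname, relname, relkind) tuples.
    """
    return {
        f"{nspname}.{relname}"
        for nspname, relname, relkind in catalog_rows
        if relkind in TABLE_RELKINDS
    }


def is_replayable(qualified_name, tables):
    """True when DML on `qualified_name` may be applied to the target."""
    return qualified_name in tables


# Rows mirroring the reproduction case from this thread.
rows = [
    ("public", "metrics", "r"),
    ("public", "metrics_count", "m"),
]
tables = replayable_relations(rows)
```

With these rows, only public.metrics ends up in `tables`, so a DML message on public.metrics_count would be filtered out.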