dimitri / pgcopydb

Copy a Postgres database to a target Postgres server (pg_dump | pg_restore on steroids)
Other
1.17k stars 77 forks source link

pgcopydb v0.14: Filtering doesn't work! ERRORS with pgcopydb clone --follow --plugin wal2json --filters #624

Open branecc opened 9 months ago

branecc commented 9 months ago

Filtering doesn't work!

Environment:

Source pg11 (database pgbench, used pgbench tool), Rocky linux 8 Target pg15, Rocky linux 9

$ pgcopydb --version
10:53:56 520220 INFO   Running pgcopydb version 0.14 from "/usr/pgsql-15/bin/pgcopydb"
pgcopydb version 0.14
compiled with PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2), 64-bit
compatible with Postgres 11, 12, 13, 14, 15, and 16

pgcopydb compiled from source https://github.com/dimitri/pgcopydb/archive/refs/tags/v0.14.tar.gz

Executed commands:

$ export PGCOPYDB_SOURCE_PGURI="postgres://repuser@drtest-pg1/pgbench"
$ export PGCOPYDB_TARGET_PGURI="postgres://repuser@pg4-test/pgbench"
$ export PGCOPYDB_DIR=/transfer/pg11/pgbench
$ export PGCOPYDB_LOG_DIR=/transfer/pg11/logs

$ cat ~/pgcopydb/scripts/pgcopydb.pgbench.filter
[exclude-table]
public.table_no_pk

$ pgcopydb clone \
   --follow \
   --dir "$PGCOPYDB_DIR" \
   --fail-fast \
   --skip-extensions \
   --slot-name pgcopydb_rgs_wal2json \
   --plugin wal2json \
   --filters ~/pgcopydb/scripts/pgcopydb.pgbench.filter \
   2>&1 | tee $PGCOPYDB_LOG_DIR/pgcopydb_clone_follow_wal2json_filters_pgbench.log

Log file:

$ cat $PGCOPYDB_LOG_DIR/pgcopydb_clone_follow_wal2json_filters_pgbench.log
2024-01-08 20:51:04 565649 INFO   main.c:138                Running pgcopydb version 0.14 from "/usr/pgsql-15/bin/pgcopydb"
2024-01-08 20:51:04 565649 INFO   cli_common.c:1239         [SOURCE] Copying database from "postgres://repuser@drtest-pg1/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
2024-01-08 20:51:04 565649 INFO   cli_common.c:1240         [TARGET] Copying database into "postgres://repuser@pg4-test/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
2024-01-08 20:51:04 565649 INFO   pgsql.c:3683              Created logical replication slot "pgcopydb_rgs_wal2json" with plugin "wal2json" at D/E2EA6FE8 and exported snapshot 00000008-0000ECF0-1
2024-01-08 20:51:04 565649 INFO   ld_stream.c:2544          create schema if not exists pgcopydb
2024-01-08 20:51:04 565649 INFO   ld_stream.c:2544          drop table if exists pgcopydb.sentinel
2024-01-08 20:51:04 565649 WARN   pgsql.c:826               NOTICE:  table "sentinel" does not exist, skipping
2024-01-08 20:51:04 565649 INFO   ld_stream.c:2544          create table pgcopydb.sentinel(startpos pg_lsn, endpos pg_lsn, apply bool,  write_lsn pg_lsn, flush_lsn pg_lsn, replay_lsn pg_lsn)
2024-01-08 20:51:04 565649 INFO   ld_stream.c:2453          Created logical replication origin "pgcopydb" at LSN D/E2EA6FE8
2024-01-08 20:51:04 565654 INFO   cli_clone_follow.c:530    STEP 1: fetch source database tables, indexes, and sequences
2024-01-08 20:51:04 565654 INFO   copydb_schema.c:143       Fetched information for 1 extensions
2024-01-08 20:51:04 565658 INFO   ld_apply.c:355            Waiting until the pgcopydb sentinel apply is enabled
2024-01-08 20:51:04 565656 INFO   ld_stream.c:649           Resuming streaming at LSN D/E2EA6FE8 from replication slot "pgcopydb_rgs_wal2json"
2024-01-08 20:51:04 565654 INFO   copydb_schema.c:440       Fetched information for 4 tables, with an estimated total of 1000  tuples and 128 MB
2024-01-08 20:51:04 565654 INFO   copydb_schema.c:466       Fetched information for 3 indexes
2024-01-08 20:51:04 565654 INFO   sequences.c:41            Fetching information for 0 sequences
2024-01-08 20:51:04 565656 INFO   pgsql.c:4400              Reported write_lsn D/E2EA6FE8, flush_lsn D/E2EA6FE8, replay_lsn 0/0
2024-01-08 20:51:04 565654 INFO   cli_clone_follow.c:547    STEP 2: dump the source database schema (pre/post data)
2024-01-08 20:51:04 565654 INFO   pgcmd.c:473                /usr/pgsql-15/bin/pg_dump -Fc --snapshot 00000008-0000ECF0-1 --section pre-data --file /transfer/pg11/pgbench/schema/pre.dump 'postgres://repuser@drtest-pg1/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60'
2024-01-08 20:51:04 565654 INFO   pgcmd.c:473                /usr/pgsql-15/bin/pg_dump -Fc --snapshot 00000008-0000ECF0-1 --section post-data --file /transfer/pg11/pgbench/schema/post.dump 'postgres://repuser@drtest-pg1/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60'
2024-01-08 20:51:04 565654 INFO   cli_clone_follow.c:559    STEP 3: restore the pre-data section to the target database
2024-01-08 20:51:04 565654 INFO   pgcmd.c:901                /usr/pgsql-15/bin/pg_restore --dbname 'postgres://repuser@pg4-test/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --single-transaction --use-list /transfer/pg11/pgbench/schema/pre-filtered.list /transfer/pg11/pgbench/schema/pre.dump
2024-01-08 20:51:04 565654 INFO   indexes.c:30              STEP 6: starting 4 CREATE INDEX processes
2024-01-08 20:51:04 565654 INFO   indexes.c:31              STEP 7: constraints are built by the CREATE INDEX processes
2024-01-08 20:51:04 565666 INFO   table-data.c:420          STEP 4: starting 4 table data COPY processes
2024-01-08 20:51:04 565654 INFO   vacuum.c:32               STEP 8: starting 4 VACUUM processes
2024-01-08 20:51:04 565654 INFO   sequences.c:104           STEP 9: reset sequences values
2024-01-08 20:51:04 565679 INFO   blobs.c:189               STEP 5: starting 4 Large Objects workers
2024-01-08 20:51:04 565680 INFO   sequences.c:172           Reset sequences values on the target database
2024-01-08 20:51:04 565679 INFO   blobs.c:501               Added 0 large objects to the queue
2024-01-08 20:51:06 565654 INFO   cli_clone_follow.c:581    STEP 10: restore the post-data section to the target database
2024-01-08 20:51:06 565654 INFO   pgcmd.c:901                /usr/pgsql-15/bin/pg_restore --dbname 'postgres://repuser@pg4-test/pgbench?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --single-transaction --use-list /transfer/pg11/pgbench/schema/post-filtered.list /transfer/pg11/pgbench/schema/post.dump
2024-01-08 20:51:06 565654 INFO   cli_clone_follow.c:599    Updating the pgcopydb.sentinel to enable applying changes

  OID | Schema |             Name | copy duration | indexes | create index duration
------+--------+------------------+---------------+---------+----------------------
39425 | public | pgbench_accounts |         1s471 |       1 |                 228ms
39428 | public | pgbench_branches |          29ms |       1 |                  24ms
39422 | public |  pgbench_tellers |          33ms |       1 |                  16ms
39419 | public |  pgbench_history |          29ms |       0 |                   0ms

                                               Step   Connection    Duration   Concurrency
 --------------------------------------------------   ----------  ----------  ------------
                                        Dump Schema       source        95ms             1
   Catalog Queries (table ordering, filtering, etc)       source       317ms             1
                                     Prepare Schema       target        39ms             1
      COPY, INDEX, CONSTRAINTS, VACUUM (wall clock)         both       1s806         4 + 8
                                  COPY (cumulative)         both       1s562             4
                         Large Objects (cumulative)         both       161ms             4
             CREATE INDEX, CONSTRAINTS (cumulative)       target       268ms             4
                                    Finalize Schema       target       197ms             1
 --------------------------------------------------   ----------  ----------  ------------
                          Total Wall Clock Duration         both       2s478         4 + 8
 --------------------------------------------------   ----------  ----------  ------------

2024-01-08 20:51:07 565658 INFO   ld_apply.c:365            The pgcopydb sentinel has enabled applying changes
2024-01-08 20:51:07 565658 INFO   ld_apply.c:1277           Setting up previous LSN from replication origin "pgcopydb" progress at D/E2EA6FE8
2024-01-08 20:51:07 565658 INFO   ld_apply.c:254            Catchup-up with changes from LSN D/E2EA6FE8
2024-01-08 20:51:07 565658 INFO   ld_apply.c:93             File "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql" does not exists yet, exit
2024-01-08 20:51:07 565656 ERROR  pgsql.c:1511              Postgres query was interrupted: update pgcopydb.sentinel set startpos = $2, write_lsn = $1, flush_lsn = $2 returning startpos, endpos, apply, write_lsn, flush_lsn, replay_lsn
2024-01-08 20:51:07 565656 ERROR  pgsql.c:5336              Failed to fetch pgcopydb.sentinel current values
2024-01-08 20:51:07 565656 INFO   pgsql.c:4400              Reported write_lsn D/E2F00CC8, flush_lsn D/E2F00CC8, replay_lsn 0/0
2024-01-08 20:51:07 565656 INFO   ld_stream.c:522           Streamed up to write_lsn D/E2F00CC8, flush_lsn D/E2F00CC8, stopping: endpos is 0/0
2024-01-08 20:51:07 565656 INFO   follow.c:674              Prefetch process has terminated
2024-01-08 20:51:07 565655 INFO   follow.c:616              Subprocesses for prefetch, transform, and catchup have now all exited
2024-01-08 20:51:07 565655 INFO   follow.c:426              Catching-up from existing on-disk files
2024-01-08 20:51:07 565714 INFO   ld_transform.c:983        Transformed 2 JSON messages into SQL file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql"
2024-01-08 20:51:08 565715 INFO   ld_apply.c:1277           Setting up previous LSN from replication origin "pgcopydb" progress at D/E2EA6FE8
2024-01-08 20:51:08 565715 INFO   ld_apply.c:254            Catchup-up with changes from LSN D/E2EA6FE8
2024-01-08 20:51:08 565715 INFO   ld_apply.c:540            Replaying changes from file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql"
2024-01-08 20:51:08 565715 INFO   ld_apply.c:131            Apply reached D/E2F00CC8 in "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql"
2024-01-08 20:51:08 565715 INFO   ld_apply.c:149            Reached end of file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql" at D/E2F00CC8.
2024-01-08 20:51:08 565655 INFO   follow.c:230              Current sentinel replay_lsn is D/E2F00CC8
2024-01-08 20:51:08 565655 INFO   follow.c:369              Restarting logical decoding follower in replay mode
2024-01-08 20:51:08 565718 INFO   ld_stream.c:2014          Resuming streaming from latest file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.json"
2024-01-08 20:51:08 565719 INFO   ld_transform.c:983        Transformed 2 JSON messages into SQL file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.sql"
2024-01-08 20:51:08 565718 INFO   ld_stream.c:694           Resuming streaming at LSN D/E2F00CC8 from first message with that LSN read in JSON file "/transfer/pg11/pgbench/cdc/000000030000000D000000E2.json", line 0
2024-01-08 20:51:08 565720 INFO   ld_apply.c:1286           Setting up previous LSN from replication origin "pgcopydb" progress at D/E2EA6FE8, overriding previous value D/E2F00CC8
2024-01-08 20:51:08 565720 INFO   ld_apply.c:254            Replaying changes from LSN D/E2EA6FE8
2024-01-08 20:51:08 565718 INFO   pgsql.c:4400              Reported write_lsn D/E2F00CC8, flush_lsn D/E2F00CC8, replay_lsn D/E2EC3BD8
2024-01-08 20:51:19 565718 INFO   pgsql.c:4400              Reported write_lsn D/E2F013F0, flush_lsn D/E2F00CC8, replay_lsn D/E2EC3BD8
...

pg15:

$ psql -d pgbench
psql (15.5)

 \d
                List of relations
 Schema |       Name       |   Type   |  Owner
--------+------------------+----------+----------
 public | pgbench_accounts | table    | postgres
 public | pgbench_branches | table    | postgres
 public | pgbench_history  | table    | postgres
 public | pgbench_tellers  | table    | postgres
 public | table_no_pk_seq  | sequence | postgres

??? public | table_no_pk_seq | sequence | postgres

Table public.table_no_pk is excluded, why does sequence public.table_no_pk_seq exist?

pg11:

$ psql -d pgbench

 \d public.table_no_pk
                           Table "public.table_no_pk"
 Column |  Type   | Collation | Nullable |               Default
--------+---------+-----------+----------+--------------------------------------
 x      | integer |           | not null | nextval('table_no_pk_seq'::regclass)
 y      | text    |           |          |

Insert data to public.table_no_pk

INSERT INTO public.table_no_pk (y) 
SELECT 'xxxxx' || i::text
FROM generate_series(1, 10) AS t(i); 
INSERT 0 10

But there are errors in the pgcopydb session:

2024-01-08 20:57:00 565718 INFO   pgsql.c:4400              Reported write_lsn D/E2F3EE50, flush_lsn D/E2F32248, replay_lsn D/E2F31C20
2024-01-08 20:57:03 565720 ERROR  pgsql.c:1944              [TARGET 565721] ERROR:  relation "public.table_no_pk" does not exist
2024-01-08 20:57:03 565720 ERROR  pgsql.c:1944              [TARGET 565721] LINE 1: INSERT INTO public.table_no_pk ("x", "y") overriding system ...
2024-01-08 20:57:03 565720 ERROR  pgsql.c:1944              [TARGET 565721]                     ^
2024-01-08 20:57:03 565720 ERROR  ld_replay.c:93            Failed to transform JSON messages from input stream, see above for details
2024-01-08 20:57:03 565655 ERROR  follow.c:964              Subprocess catchup with pid 565720 has exited with error code 12
2024-01-08 20:57:03 565718 ERROR  pgsql.c:1511              Postgres query was interrupted: update pgcopydb.sentinel set startpos = $2, write_lsn = $1, flush_lsn = $2 returning startpos, endpos, apply, write_lsn, flush_lsn, replay_lsn
2024-01-08 20:57:03 565718 ERROR  pgsql.c:5336              Failed to fetch pgcopydb.sentinel current values
2024-01-08 20:57:03 565718 INFO   pgsql.c:4400              Reported write_lsn D/E2F3F5C0, flush_lsn D/E2F3F5C0, replay_lsn D/E2F31C20
2024-01-08 20:57:03 565718 INFO   ld_stream.c:522           Streamed up to write_lsn D/E2F3F5C0, flush_lsn D/E2F3F5C0, stopping: endpos is 0/0
2024-01-08 20:57:03 565718 INFO   follow.c:663              Prefetch process has terminated
dimitri commented 9 months ago

See #614 : work-in-progress.