dimitri / pgcopydb

Copy a Postgres database to a target Postgres server (pg_dump | pg_restore on steroids)
Other
1.15k stars 75 forks source link

Errors with decoding #241

Closed aborschev closed 1 year ago

aborschev commented 1 year ago

Hi! I've got a test stand, with 2 VMs Debian Bulleye 4 CPU 4 GB RAM, one is with source DB, another-target. pgcopydb 0.11 is on target host.

Source: PG15 vanilla. 10 GB of some data + pgbench. I have emulated workload with pgbench:

postgres@borschev-pg-copydb1:~$ /usr/lib/postgresql/10/bin/pgbench --builtin=tpcb-like --client=4 --jobs=4 --time=600 --rate=20 --port=5433 --host=localhost --progress=20  --user=pgbench demo
starting vacuum...end.
progress: 20.0 s, 10.1 tps, lat 10.034 ms stddev 16.880, lag 0.896 ms
progress: 40.0 s, 10.0 tps, lat 11.158 ms stddev 27.644, lag 0.611 ms
...

I tried pgcopydb both with --plugin=test_decoding.

postgres@borschev-pg-copydb2:~$ pgcopydb clone --table-jobs=4 --index-jobs=4 --plugin=test_decoding --source="postgresql://postgres:pgpass@192.168.21.10:5433/demo" --target="postgresql://postgres:pgpass@localhost:5432/demo" --follow
12:54:18 42260 INFO   Running pgcopydb version 0.11-1.pgdg110+1 from "/usr/bin/pgcopydb"
12:54:18 42260 INFO   Using default origin node name "pgcopydb"
12:54:18 42260 INFO   [SOURCE] Copying database from "postgres://postgres@192.168.21.10:5433/demo?password=****"
12:54:18 42260 INFO   [TARGET] Copying database into "postgres://postgres@localhost:5432/demo?password=****"
12:54:18 42260 INFO   Exported snapshot "00000007-0000095F-1" from the source database
12:54:18 42260 INFO   Created logical replication slot "pgcopydb" with plugin "test_decoding" at LSN 2/BCB3CF68
12:54:18 42260 INFO   create schema if not exists pgcopydb
12:54:18 42260 INFO   drop table if exists pgcopydb.sentinel
12:54:18 42260 WARN   NOTICE:  table "sentinel" does not exist, skipping
12:54:18 42260 INFO   create table pgcopydb.sentinel(startpos pg_lsn, endpos pg_lsn, apply bool,  write_lsn pg_lsn, flush_lsn pg_lsn, replay_lsn pg_lsn)
12:54:18 42260 INFO   Created logical replication origin "pgcopydb" at LSN 2/BCB3CF68
12:54:18 42264 INFO   STEP 1: dump the source database schema (pre/post data)
12:54:18 42264 INFO    /usr/bin/pg_dump -Fc --snapshot 00000007-0000095F-1 --section pre-data --file /tmp/pgcopydb/schema/pre.dump 'postgres://postgres@192.168.21.10:5433/demo?'
12:54:18 42268 INFO   Resuming streaming at LSN 2/BCB3CF68 from replication slot "pgcopydb"
12:54:18 42270 INFO   Waiting until the pgcopydb sentinel apply is enabled
12:54:18 42264 INFO    /usr/bin/pg_dump -Fc --snapshot 00000007-0000095F-1 --section post-data --file /tmp/pgcopydb/schema/post.dump 'postgres://postgres@192.168.21.10:5433/demo?'
12:54:18 42264 INFO   STEP 2: fetch source database tables, indexes, and sequences
12:54:18 42264 INFO   Fetched information for 1 extensions
12:54:18 42264 WARN   NOTICE:  schema "pgcopydb" already exists, skipping
12:54:18 42264 INFO   Fetched information for 36 tables, with an estimated total of 89 million tuples and 6939 MB
12:54:18 42264 INFO   Fetched information for 47 indexes
12:54:18 42264 INFO   Fetching information for 4 sequences
12:54:18 42264 INFO   STEP 3: restore the pre-data section to the target database
12:54:18 42264 INFO    /usr/bin/pg_restore --dbname 'postgres://postgres@localhost:5432/demo?' --single-transaction --use-list /tmp/pgcopydb/schema/pre.list /tmp/pgcopydb/schema/pre.dump
12:54:19 42264 INFO   STEP 4: copy data from source to target in 4 sub-processes
12:54:19 42264 INFO   STEP 5: copy Large Objects (BLOBs) in 1 sub-process
12:54:19 42264 INFO   STEP 6: starting 4 CREATE INDEX processes
12:54:19 42264 INFO   STEP 7: constraints are built by the CREATE INDEX processes
12:54:19 42264 INFO   STEP 8: starting 4 VACUUM processes
12:54:19 42264 INFO   STEP 9: reset sequences values
12:54:19 42287 INFO   Reset sequences values on the target database
12:54:19 42278 INFO   Copying large objects
12:54:28 42268 INFO   Reported write_lsn 2/BCDE77F8, flush_lsn 0/0, replay_lsn 0/0
12:54:34 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:1225283 bid[integer]:13 abalance[integer]:763 filler[character]:'                                                                                    '
12:54:34 42269 ERROR  Failed to parse test_decoding message, see above for details
12:54:34 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BCB55238","timestamp":"2023-03-30 09:54:18.603370+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:1225283 bid[integer]:13 abalance[integer]:763 filler[character]:'                                                                                    '"}
12:54:39 42268 INFO   Reported write_lsn 2/BD0CEB90, flush_lsn 2/BCDE77F8, replay_lsn 0/0
12:54:48 42268 INFO   Reported write_lsn 2/BD3BC068, flush_lsn 2/BD0CEB90, replay_lsn 0/0
12:54:58 42268 INFO   Reported write_lsn 2/BD6DA2C0, flush_lsn 2/BD3BC068, replay_lsn 0/0
12:55:08 42268 INFO   Reported write_lsn 2/BD990930, flush_lsn 2/BD6DA2C0, replay_lsn 0/0
12:55:18 42268 INFO   Reported write_lsn 2/BDC4C820, flush_lsn 2/BD990930, replay_lsn 0/0
12:55:29 42268 INFO   Reported write_lsn 2/BDEC63D0, flush_lsn 2/BDC4C820, replay_lsn 0/0
12:55:33 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:51 bid[integer]:6 tbalance[integer]:-24000 filler[character]:null
12:55:33 42269 ERROR  Failed to parse test_decoding message, see above for details
12:55:33 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BD000F40","timestamp":"2023-03-30 09:54:34.985652+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:51 bid[integer]:6 tbalance[integer]:-24000 filler[character]:null"}
12:55:38 42268 INFO   Reported write_lsn 2/BE1C1C68, flush_lsn 2/BDEC63D0, replay_lsn 0/0
12:55:49 42268 INFO   Reported write_lsn 2/BE482B40, flush_lsn 2/BE1C1C68, replay_lsn 0/0
12:55:59 42268 INFO   Reported write_lsn 2/BE70A488, flush_lsn 2/BE482B40, replay_lsn 0/0
12:56:09 42268 INFO   Reported write_lsn 2/BE9B9420, flush_lsn 2/BE70A488, replay_lsn 0/0
12:56:19 42268 INFO   Reported write_lsn 2/BEC11310, flush_lsn 2/BE9B9420, replay_lsn 0/0
12:56:29 42268 INFO   Reported write_lsn 2/BEE58F28, flush_lsn 2/BEC11310, replay_lsn 0/0
12:56:35 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:199 bid[integer]:20 tbalance[integer]:44117 filler[character]:null
12:56:35 42269 ERROR  Failed to parse test_decoding message, see above for details
12:56:35 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BE001C80","timestamp":"2023-03-30 09:55:33.293485+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:199 bid[integer]:20 tbalance[integer]:44117 filler[character]:null"}
12:56:35 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:250 bid[integer]:25 tbalance[integer]:72403 filler[character]:null
12:56:35 42269 ERROR  Failed to parse test_decoding message, see above for details
12:56:35 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BF002540","timestamp":"2023-03-30 09:56:35.777511+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:250 bid[integer]:25 tbalance[integer]:72403 filler[character]:null"}
12:56:35 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:199 bid[integer]:20 tbalance[integer]:44117 filler[character]:null
12:56:35 42269 ERROR  Failed to parse test_decoding message, see above for details
12:56:35 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BE001C80","timestamp":"2023-03-30 09:55:33.293485+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:199 bid[integer]:20 tbalance[integer]:44117 filler[character]:null"}
12:56:39 42268 INFO   Reported write_lsn 2/BF0A6688, flush_lsn 2/BEE58F28, replay_lsn 0/0
12:56:50 42268 INFO   Reported write_lsn 2/BF307870, flush_lsn 2/BF0A6688, replay_lsn 0/0
12:56:59 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:250 bid[integer]:25 tbalance[integer]:72403 filler[character]:null
12:56:59 42269 ERROR  Failed to parse test_decoding message, see above for details
12:56:59 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BF002540","timestamp":"2023-03-30 09:56:35.777511+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:250 bid[integer]:25 tbalance[integer]:72403 filler[character]:null"}
12:56:59 42268 INFO   Reported write_lsn 2/C0059C48, flush_lsn 2/BF307870, replay_lsn 0/0
12:57:09 42268 INFO   Reported write_lsn 2/C0CA7B18, flush_lsn 2/C00677E8, replay_lsn 0/0
12:57:12 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:1379949 bid[integer]:14 abalance[integer]:-2733 filler[character]:'                                                                                    '
12:57:12 42269 ERROR  Failed to parse test_decoding message, see above for details
12:57:12 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C000AA90","timestamp":"2023-03-30 09:56:59.388765+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:1379949 bid[integer]:14 abalance[integer]:-2733 filler[character]:'                                                                                    '"}
12:57:19 42268 INFO   Reported write_lsn 2/C1736F60, flush_lsn 2/C0CAF898, replay_lsn 0/0
12:57:27 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:2881924 bid[integer]:29 abalance[integer]:-2263 filler[character]:'                                                                                    '
12:57:27 42269 ERROR  Failed to parse test_decoding message, see above for details
12:57:27 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C1007AD8","timestamp":"2023-03-30 09:57:12.514890+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:2881924 bid[integer]:29 abalance[integer]:-2263 filler[character]:'                                                                                    '"}
12:57:29 42268 INFO   Reported write_lsn 2/C2216AF8, flush_lsn 2/C1736F60, replay_lsn 0/0
12:57:40 42268 INFO   Reported write_lsn 2/C2D22900, flush_lsn 2/C2216AF8, replay_lsn 0/0
12:57:43 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:347612 bid[integer]:4 abalance[integer]:125 filler[character]:'                                                                                    '
12:57:43 42269 ERROR  Failed to parse test_decoding message, see above for details
12:57:43 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C2000B08","timestamp":"2023-03-30 09:57:27.856463+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:347612 bid[integer]:4 abalance[integer]:125 filler[character]:'                                                                                    '"}
12:57:50 42268 INFO   Reported write_lsn 2/C34B6598, flush_lsn 2/C2D22900, replay_lsn 0/0
12:58:00 42268 INFO   Reported write_lsn 2/C3C68408, flush_lsn 2/C34B6598, replay_lsn 0/0
12:58:10 42268 INFO   Reported write_lsn 2/C3F11508, flush_lsn 2/C3C68408, replay_lsn 0/0
12:58:14 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:2322293 bid[integer]:24 abalance[integer]:-260 filler[character]:'                                                                                    '
12:58:14 42269 ERROR  Failed to parse test_decoding message, see above for details
12:58:14 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C30085E8","timestamp":"2023-03-30 09:57:43.993683+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:2322293 bid[integer]:24 abalance[integer]:-260 filler[character]:'                                                                                    '"}
12:58:16 42264 INFO   COPY phase is done, now waiting for vacuum, index, blob, and sequences processes
12:58:20 42268 INFO   Reported write_lsn 2/C41BCD08, flush_lsn 2/C3F11508, replay_lsn 0/0
12:58:30 42268 INFO   Reported write_lsn 2/C44157D0, flush_lsn 2/C41BCD08, replay_lsn 0/0
12:58:40 42268 INFO   Reported write_lsn 2/C469A818, flush_lsn 2/C44157D0, replay_lsn 0/0
12:58:50 42268 INFO   Reported write_lsn 2/C4931DC0, flush_lsn 2/C469A818, replay_lsn 0/0
12:58:58 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:59 bid[integer]:6 tbalance[integer]:95234 filler[character]:null
12:58:58 42269 ERROR  Failed to parse test_decoding message, see above for details
12:58:58 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C4000928","timestamp":"2023-03-30 09:58:14.170191+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:59 bid[integer]:6 tbalance[integer]:95234 filler[character]:null"}
12:58:58 42264 INFO   STEP 10: restore the post-data section to the target database
12:58:58 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:1225283 bid[integer]:13 abalance[integer]:763 filler[character]:'                                                                                    '
12:58:58 42269 ERROR  Failed to parse test_decoding message, see above for details
12:58:58 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/BCB55238","timestamp":"2023-03-30 09:54:18.603370+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:1225283 bid[integer]:13 abalance[integer]:763 filler[character]:'                                                                                    '"}
12:58:58 42264 INFO    /usr/bin/pg_restore --dbname 'postgres://postgres@localhost:5432/demo?' --single-transaction --use-list /tmp/pgcopydb/schema/post.list /tmp/pgcopydb/schema/post.dump
12:59:00 42268 INFO   Reported write_lsn 2/C4C0FDD8, flush_lsn 2/C4931DC0, replay_lsn 0/0
12:59:10 42268 INFO   Reported write_lsn 2/C4EBD028, flush_lsn 2/C4C0FDD8, replay_lsn 0/0
12:59:15 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_tellers: UPDATE: tid[integer]:59 bid[integer]:6 tbalance[integer]:95234 filler[character]:null
12:59:15 42269 ERROR  Failed to parse test_decoding message, see above for details
12:59:15 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C4000928","timestamp":"2023-03-30 09:58:14.170191+0300","message":"table public.pgbench_tellers: UPDATE: tid[integer]:59 bid[integer]:6 tbalance[integer]:95234 filler[character]:null"}
12:59:20 42268 INFO   Reported write_lsn 2/C50F63A8, flush_lsn 2/C4EBD028, replay_lsn 0/0
12:59:30 42268 INFO   Reported write_lsn 2/C53A1450, flush_lsn 2/C50F63A8, replay_lsn 0/0
12:59:40 42268 INFO   Reported write_lsn 2/C5621E78, flush_lsn 2/C53A1450, replay_lsn 0/0
12:59:50 42268 INFO   Reported write_lsn 2/C589F2C0, flush_lsn 2/C5621E78, replay_lsn 0/0
13:00:00 42268 INFO   Reported write_lsn 2/C5B0B5F8, flush_lsn 2/C589F2C0, replay_lsn 0/0
13:00:10 42268 INFO   Reported write_lsn 2/C5D4F6A0, flush_lsn 2/C5B0B5F8, replay_lsn 0/0
13:00:20 42268 INFO   Reported write_lsn 2/C5FD7E70, flush_lsn 2/C5D4F6A0, replay_lsn 0/0
13:00:21 42269 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:2575525 bid[integer]:26 abalance[integer]:1689 filler[character]:'                                                                                    '
13:00:21 42269 ERROR  Failed to parse test_decoding message, see above for details
13:00:21 42269 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C5001290","timestamp":"2023-03-30 09:59:15.857050+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:2575525 bid[integer]:26 abalance[integer]:1689 filler[character]:'                                                                                    '"}
13:00:30 42268 INFO   Reported write_lsn 2/C6225CB8, flush_lsn 2/C5FD7E70, replay_lsn 0/0
13:00:31 42264 INFO   Updating the pgcopydb.sentinel to enable applying changes

  OID |    Schema |             Name | copy duration | indexes | create index duration
------+-----------+------------------+---------------+---------+----------------------
24709 |  bookings |   ticket_flights |           38s |       1 |                 3m05s
24769 | bookings1 |   ticket_flights |           37s |       1 |                   42s
24829 | bookings2 |   ticket_flights |           38s |       1 |                 1m57s
24889 | bookings3 |   ticket_flights |           38s |       1 |                 1m29s
24894 | bookings3 |          tickets |           47s |       1 |                   18s
24834 | bookings2 |          tickets |           47s |       1 |                   12s
24714 |  bookings |          tickets |           46s |       1 |                   14s
24774 | bookings1 |          tickets |           45s |       1 |                   18s
24681 |  bookings |  boarding_passes |         1m05s |       3 |                 2m42s
24741 | bookings1 |  boarding_passes |         1m08s |       3 |                 1m10s
24801 | bookings2 |  boarding_passes |         1m08s |       3 |                 1m18s
24861 | bookings3 |  boarding_passes |         1m08s |       3 |                 1m05s
24900 |    public | pgbench_accounts |         1m13s |       1 |                 4s690
24684 |  bookings |         bookings |           36s |       1 |                 5s954
24744 | bookings1 |         bookings |           37s |       1 |                 5s016
24804 | bookings2 |         bookings |           38s |       1 |                 4s222
24864 | bookings3 |         bookings |           47s |       1 |                 4s881
24687 |  bookings |          flights |           27s |       2 |                 1s006
24747 | bookings1 |          flights |           27s |       2 |                 931ms
24807 | bookings2 |          flights |           11s |       2 |                 965ms
24867 | bookings3 |          flights |           10s |       2 |                 1s145
24909 |    public |  pgbench_tellers |         157ms |       1 |                  67ms
24903 |    public | pgbench_branches |         114ms |       1 |                 220ms
24705 |  bookings |            seats |         182ms |       1 |                 188ms
24765 | bookings1 |            seats |         156ms |       1 |                 112ms
24825 | bookings2 |            seats |         125ms |       1 |                 503ms
24885 | bookings3 |            seats |         173ms |       1 |                 426ms
24906 |    public |  pgbench_history |         167ms |       0 |                   0ms
24671 |  bookings |    airports_data |         427ms |       1 |                 266ms
24731 | bookings1 |    airports_data |         1s558 |       1 |                 205ms
24791 | bookings2 |    airports_data |         717ms |       1 |                 135ms
24851 | bookings3 |    airports_data |         908ms |       1 |                 249ms
24660 |  bookings |   aircrafts_data |         1s079 |       1 |                 315ms
24720 | bookings1 |   aircrafts_data |         436ms |       1 |                  99ms
24780 | bookings2 |   aircrafts_data |         773ms |       1 |                 216ms
24840 | bookings3 |   aircrafts_data |         651ms |       1 |                  88ms

                                               Step   Connection    Duration   Concurrency
 --------------------------------------------------   ----------  ----------  ------------
                                        Dump Schema       source       324ms             1
   Catalog Queries (table ordering, filtering, etc)       source        35ms             1
                                     Prepare Schema       target       252ms             1
      COPY, INDEX, CONSTRAINTS, VACUUM (wall clock)         both       4m39s         4 + 8
                                  COPY (cumulative)         both      15m29s             4
                         Large Objects (cumulative)         both        30ms             1
             CREATE INDEX, CONSTRAINTS (cumulative)       target      15m07s             4
                                    Finalize Schema       target       1m33s             1
 --------------------------------------------------   ----------  ----------  ------------
                          Total Wall Clock Duration         both       6m13s         4 + 8
 --------------------------------------------------   ----------  ----------  ------------

13:00:32 42270 INFO   The pgcopydb sentinel has enabled applying changes
13:00:32 42270 INFO   Catching-up with changes from LSN 2/BCB3CF68
13:00:32 42270 INFO   File "/home/postgres/.local/share/pgcopydb/0000000100000002000000BC.sql" does not exists yet, exit
13:00:33 42268 INFO   Reported write_lsn 2/C62C39E8, flush_lsn 2/C62C39E8, replay_lsn 0/0
13:00:33 42268 WARN   Streaming got interrupted at 2/C62C39E8 after processing 45478 messages
13:00:33 42267 INFO   Subprocesses for prefetch, transform, and catchup have now all exited
13:00:33 42267 INFO   Current sentinel replay_lsn is 0/0
13:00:33 42267 INFO   Catching-up from existing on-disk files
13:00:33 42267 INFO   Catching-up with changes from LSN 2/BCB3CF68
13:00:33 42267 INFO   File "/home/postgres/.local/share/pgcopydb/0000000100000002000000BC.sql" does not exists yet, exit
13:00:33 42267 INFO   Current sentinel replay_lsn is 0/0
13:00:33 42267 INFO   Restarting logical decoding follower in replay mode
13:00:33 42543 INFO   Replaying changes from LSN 2/BCB3CF68
13:00:33 42541 INFO   Resuming streaming at LSN 2/C62C39E8 from replication slot "pgcopydb"
13:00:33 42542 ERROR  Failed to find old-key in UPDATE message: table public.pgbench_accounts: UPDATE: aid[integer]:55158 bid[integer]:1 abalance[integer]:-105 filler[character]:'                                                                                    '
13:00:33 42542 ERROR  Failed to parse test_decoding message, see above for details
13:00:33 42542 ERROR  Failed to parse JSON message: {"action":"U","xid":"0","lsn":"2/C62C5A70","timestamp":"2023-03-30 10:00:33.715971+0300","message":"table public.pgbench_accounts: UPDATE: aid[integer]:55158 bid[integer]:1 abalance[integer]:-105 filler[character]:'                                                                                    '"}
13:00:33 42542 ERROR  Failed to transform JSON messages from input stream, see above for details
13:00:33 42267 ERROR  Subprocess transform with pid 42542 has exited with error code 12
13:00:33 42541 INFO   Reported write_lsn 2/C62EB160, flush_lsn 2/C62EB160, replay_lsn 0/0
13:00:33 42541 INFO   Streaming is now finished after processing 82 messages
13:00:34 42267 ERROR  Some sub-process exited with errors, see above for details
13:00:34 42260 ERROR  follow process 42267 has terminated [12]
13:00:34 42260 INFO   Fetching information for 4 sequences
13:00:34 42260 INFO   Reset sequences values on the target database

If You propose to do some tests or ask for more diagnostics - I will try to do these tests with this setup.

dimitri commented 1 year ago

Hi @aborschev ; thanks for reporting this issue. The problem we face is that in some cases the logical decoding test_decoding plugin skips the old-key part of the UPDATE statement entirely. The client parts (pgcopydb, here) is then supposed to maintain an in-memory cache of all the primary keys of all the tables that receive UPDATE statements in order to be able to make sense of the UPDATE messages.

At the moment such a feature is not implemented in pgcopydb, which only knows how to parse complete UPDATE messages in the logical decoding output from test_decoding.

0x416e746f6e commented 1 year ago

Experiencing similar issue. Tried to use wal2json plugin instead => all seemed to work flawlessly, until I cared to look into cdc/xxxx.sql and found the following update statement there:

UPDATE ... SET ... WHERE "id" = 3.55061e+07;

Turns out it converted 35506143 bigint into a floating point!

However, checking corresponding JSON file:

{
  "action": "U",
  "xid": "635103562",
  "lsn": "B69/82588E0",
  "timestamp": "2023-04-20 11:17:02.410232+0000",
  "message": {
    "action": "U",
    "xid": 635103562,
    "schema": "public",
    "table": "xxx",
    "columns": [
      { "name": "id", "value": 35506143 },
      // ...
    ],
    "identity": [{ "name": "id", "value": 35506143 }]
  }
}

Should I open an issue?

0x416e746f6e commented 1 year ago

This seems to be related: https://github.com/dimitri/pgcopydb/issues/127