dimitri / pgcopydb

Copy a Postgres database to a target Postgres server (pg_dump | pg_restore on steroids)
Other
1.19k stars 78 forks source link

Fixes online migration with composite primary key #879

Open hanefi opened 2 months ago

hanefi commented 2 months ago

Refactor code to skip executing empty UPDATE statements

The logical_message_metadata_should_skip_statement function has been added to check if the logical message metadata should be skipped. This is necessary due to a bug in the wall2json output plugin that creates an UPDATE statement with an empty SET clause and places the actual SET clause in the WHERE clause. The function checks for this specific pattern and returns true if found, indicating that the statement should be skipped during execution.

In passing fixes a double free bug.

Fixes: dimitri/pgcopydb#750

dimitri commented 2 months ago

I am confused here:

  1. In #750 the reporter is using test_decoding, not wal2json.
  2. The comment here is mentioning a bug in wal2json: can we link to the wal2json issue?
  3. The PR does not address why it is safe to skip the UPDATE statement when we fail to format it -- one way to show that it's safe would be for the unit test to also run a data compare after the end of the CDC apply.

Also the PR uses the wrong spelling wall2json in multiple places.

From issue #750:

15:22:30.588 8683 ERROR  Failed to parse decoding message for UPDATE on table public.test_002: SET clause columns not found
15:22:30.589 8683 ERROR  Failed to parse UPDATE new-tuple columns for logical message table public.test_002: UPDATE: id[integer]:1 name[text]:'postgres'
15:22:30.589 8683 ERROR  Failed to parse test_decoding UPDATE message: table public.test_002: UPDATE: id[integer]:1 name[text]:'postgres'
15:22:30.589 8683 ERROR  Failed to parse test_decoding message, see above for details
arajkumar commented 1 month ago

@hanefi Looks like this PR and https://github.com/dimitri/pgcopydb/pull/883 addresses same issue?

As mentioned in the PR #883, this is not a problem in the plugin, instead the logic from ld_transform which skips columns from SET clause having same values as in WHERE clause leading to removal of all SET clause items.