2ndQuadrant / pglogical

Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
http://2ndquadrant.com/en/resources/pglogical/
Other
1.01k stars 153 forks source link

Is there any plan for supporting REPLICA IDENTITY FULL? #340

Open houzi1099857978 opened 3 years ago

houzi1099857978 commented 3 years ago

I can't add a table with REPLICA IDENTITY FULL in publication,but I need the old data in subcription for app to handle the data,so is there any plain to support this?

petere commented 3 years ago

There are currently no plans to support this.

SteveDirschelTR commented 3 years ago

I am also curious why replica identity full isn't supported. Scenario:

At 10:00:00.001 row with PK pk123 is deleted from site A database. At 10:00:00.100 row with PK pk123 is updated on site B database.

Replication LAG is 1 second so both changes happened before replicated change arrived from the other site.

track_commit_timestamp = ON pglogical.conflict_resolution = last_update_wins.

Since the update happened after the delete the update should win. While testing this scenario the end result was the row was deleted from both site A and B.

In order to correctly handle this situation I believe you'd need REPLICA IDENTITY set to FULL so you get before values for all columns, not just the columns being updated along with the PK. The update that replicates to site A does not find the row so the update should get changed to an insert. Since REPLICA IDENTITY is set to FULL before values for all columns on the table should be available so there is enough information to change the update to an insert. The row should not get deleted on site B since the delete happened before the last update. End result is the matching row should exist on both sites.

Comments to this scenario?

houzi1099857978 commented 2 years ago

There are currently no plans to support this.

PostgreSQL has a hard limit on the page size. This means that values that are larger than around 8 KBs need to be stored by using TOAST storage. This impacts replication messages that are coming from the database. Values that were stored by using the TOAST mechanism and that have not been changed are not included in the message, unless they are part of the table’s replica identity. There is no safe way for Debezium to read the missing value out-of-bands directly from the database, as this would potentially lead to race conditions. Consequently, That means i can't use any logical decoding plugin to handle the problem when i use pglogcal, because pglogical doesn't support REPLICA IDENTITY FULL, that make me can't transfer data to kafka,is there any solution to handel this problem?

petere commented 2 years ago

Built-in logical replication supports REPLICA IDENTITY FULL.

houzi1099857978 commented 2 years ago

Built-in logical replication supports REPLICA IDENTITY FULL.

But built-in logical replication doesn't support two-side replication,we want to build a bicentric data center.

bonesmoses commented 2 years ago

For what it's worth, EDB provides Postgres Distributed for bi-directional replication that also supports REPLICA IDENTITY FULL. If that's what you meant by "bicentric", there's a purpose-built product made specifically for that scenario.