pgpartman / pg_partman

Partition management extension for PostgreSQL
Other
2.05k stars 279 forks source link

no REPLICA IDENTITY error #537

Closed rsquaretrade closed 1 year ago

rsquaretrade commented 1 year ago

Hello. This might be a duplicate issue, but I'd like to post it. Today, I migrated a partitioned table from inheritance to native, the set up is this: the table in OLTP is NOT partitioned, it is replicated to the warehouse where the table is partitioned. I added a non-partitioned primary key in a template table, the control is a column_date, the child tables appear to be inheriting it. After the migration, error logs shows the following:

"the table "logical replication target relation ""public.table_name"" has neither REPLICA IDENTITY index nor PRIMARY KEY and published relation does not have REPLICA IDENTITY FULL"

Is there a workaround? Thank you in advance.

keithf4 commented 1 year ago

Can you double-check that all child tables have the primary key created? If you add the primary key to the template table after you create the partition set, it does not go back and create that primary key on any existing children, only new ones.

rsquaretrade commented 1 year ago

It is there, premake was only 1

warehouse=> \d+ public.audit_log_p2025 Table "public.audit_log_p2025" Column │ Type │ Collation │ Nullable │ Default │ Storage │ Compression │ Stats target │ Description ─────────────────────┼──────────────────────────┼───────────┼──────────┼─────────┼──────────┼─────────────┼──────────────┼───────────── id │ text │ │ not null │ │ extended │ │ │ audit_obj_id │ text │ │ not null │ │ extended │ │ │ change_time │ timestamp with time zone │ │ not null │ │ plain │ │ │ change_type_id │ text │ │ not null │ │ extended │ │ │ changed_by │ text │ │ │ │ extended │ │ │ changed_from_str │ text │ │ │ │ extended │ │ │ changed_to_str │ text │ │ │ │ extended │ │ │ changed_from_char │ character(1) │ │ │ │ extended │ │ │ changed_to_char │ character(1) │ │ │ │ extended │ │ │ changed_from_date │ timestamp with time zone │ │ │ │ plain │ │ │ changed_to_date │ timestamp with time zone │ │ │ │ plain │ │ │ comments │ text │ │ │ │ extended │ │ │ changed_from_number │ numeric(15,2) │ │ │ │ main │ │ │ changed_to_number │ numeric(15,2) │ │ │ │ main │ │ │ Partition of: audit_log_native_rollback FOR VALUES FROM ('2025-01-01 00:00:00-08') TO ('2026-01-01 00:00:00-08') Partition constraint: ((change_time IS NOT NULL) AND (change_time >= '2025-01-01 00:00:00-08'::timestamp with time zone) AND (change_time < '2026-01-01 00:00:00-08'::timestamp with time zone)) Indexes: "audit_log_p2025_pkey" PRIMARY KEY, btree (id) "audit_log_p2025_audit_obj_id_change_time_idx" btree (audit_obj_id, change_time) "audit_log_p2025_audit_obj_id_idx" btree (audit_obj_id) "audit_log_p2025_change_time_idx" btree (change_time) "audit_log_p2025_change_type_id_idx" btree (change_type_id) Access method: heap

I found this website regarding this specific error with 3 solutions which is to : https://mydbanotebook.org/post/replication-key/

  1. Add a unique index, as a test, I added with 2 columns (id, change_time). Then alter table withuniqueindex replica identity using index withuniqueindex_c_key; (I will set up replication and see if this works).
  2. alter table badmodel replica identity full; (but I'm not sure if there are performance implications using this).
  3. add a primary key (i already added in the template table).

I appreciate any suggestions as well from you.

keithf4 commented 1 year ago

Any chance you could share either the publication/subscription commands you're using to create the logical set or, if it's already set up, the configuration settings for it? Trying to set this up the same way that you are so I can try to recreate the situation and see what the problem is.

Also, what version of PostgreSQL are you running?

rsquaretrade commented 1 year ago

Replication has been set up awhile back, in postgresql.conf wal_level=replica, asynchronous. Basic set up is the table on the OLTP side is NOT partitioned, but the replicated table in the warehouse is partitioned. DB is on PG14

Publisher/Subscriber Set-up: (1) Create a publication on Publisher.

SCHEMA=generic_schema psql service=${SCHEMA}_prod -X <<DDL

CREATE PUBLICATION ${SCHEMA}_pub FOR TABLE $SCHEMA.wm_item_file , $SCHEMA.table1 , $SCHEMA.table2 , $SCHEMA.table3;

ALTER PUBLICATION ${SCHEMA}_pub OWNER TO replication; DDL

database_name=> \dRp+

(2) Create subscription on Subscriber.

SCHEMA=schema_name psql service=warehouse_prod -X <<DDL CREATE SUBSCRIPTION ${SCHEMA}_sub CONNECTION 'postgresql://replication@10.000.00.000:5432/database_name' PUBLICATION ${SCHEMA}_pub; DDL

(3) Create tables on subscriber using pg_dump

SCHEMA=schema_name psql service=dbname_prod -X <<DDL

ALTER PUBLICATION ${SCHEMA}_pub ADD TABLE $SCHEMA.table1 , $SCHEMA.table2 , $SCHEMA.table3; DDL

(4) Alter subscription on subscriber database.

SCHEMA=schema_name psql service=warehouse_prod -X <<DDL ALTER SUBSCRIPTION ${SCHEMA}_sub REFRESH PUBLICATION ; DDL

(5) Apply post-data section via pg_restore

keithf4 commented 1 year ago

So, I'm not able to get the same error that you're getting. I'm able to reproduce a similar one, but only if the published table is missing the primary key. And reading back on your error again, it does seem to indicate something similar. Do the tables on the publisher side have primary/unique keys?

SUBSCRIBER:

keith=# create table audit_log (id text, change_time timestamp with time zone) partition by range (change_time);

create table audit_log_template (like audit_log);

alter table audit_log_template add primary key (id);

select partman.create_parent('public.audit_log', 'change_time', 'native', '1 year', p_template_table := 'public.audit_log_template');
CREATE TABLE
CREATE TABLE
ALTER TABLE
 create_parent 
---------------
 t
(1 row)

keith=# \d+ audit_log
                                             Partitioned table "public.audit_log"
   Column    |           Type           | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
-------------+--------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 id          | text                     |           |          |         | extended |             |              | 
 change_time | timestamp with time zone |           |          |         | plain    |             |              | 
Partition key: RANGE (change_time)
Partitions: audit_log_p2019 FOR VALUES FROM ('2019-01-01 00:00:00-05') TO ('2020-01-01 00:00:00-05'),
            audit_log_p2020 FOR VALUES FROM ('2020-01-01 00:00:00-05') TO ('2021-01-01 00:00:00-05'),
            audit_log_p2021 FOR VALUES FROM ('2021-01-01 00:00:00-05') TO ('2022-01-01 00:00:00-05'),
            audit_log_p2022 FOR VALUES FROM ('2022-01-01 00:00:00-05') TO ('2023-01-01 00:00:00-05'),
            audit_log_p2023 FOR VALUES FROM ('2023-01-01 00:00:00-05') TO ('2024-01-01 00:00:00-05'),
            audit_log_p2024 FOR VALUES FROM ('2024-01-01 00:00:00-05') TO ('2025-01-01 00:00:00-05'),
            audit_log_p2025 FOR VALUES FROM ('2025-01-01 00:00:00-05') TO ('2026-01-01 00:00:00-05'),
            audit_log_p2026 FOR VALUES FROM ('2026-01-01 00:00:00-05') TO ('2027-01-01 00:00:00-05'),
            audit_log_p2027 FOR VALUES FROM ('2027-01-01 00:00:00-05') TO ('2028-01-01 00:00:00-05'),

PUBLISHER:

keith_pub=# create table audit_log (id text, change_time timestamp with time zone);
CREATE TABLE
keith_pub=# create publication audit_pub for table public.audit_log;
CREATE PUBLICATION

SUBSCRIBER:

keith=# create subscription audit_sub connection 'host=localhost user=keith port=5433 dbname=keith_pub' publication audit_pub;
NOTICE:  created replication slot "audit_sub" on publisher
CREATE SUBSCRIPTION

PUBLISHER:

keith_pub=# insert into audit_log values (22, now());
insert into audit_log values (44, now());
INSERT 0 1
INSERT 0 1
keith_pub=# delete from audit_log;
ERROR:  cannot delete from table "audit_log" because it does not have a replica identity and publishes deletes
HINT:  To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.

SUBSCRIBER:

keith=# select * from audit_log;
 id |          change_time          
----+-------------------------------
 22 | 2023-06-28 17:54:17.676579-04
 44 | 2023-06-28 17:54:17.679962-04
(2 rows)

So you can see the setup worked, but I got an error if I tried to delete anything on the publication side. However, when I add a primary key there, then it works ok

keith_pub=# alter table audit_log add primary key (id);
ALTER TABLE
keith_pub=# delete from audit_log;
DELETE 2

However, this brings up another issue that I think you will have as well. Inserts seem to replicate ok, but deletes do not. Running select again, you can see they're still there SUBSCRIBER:

keith=# select * from audit_log;
 id |          change_time          
----+-------------------------------
 22 | 2023-06-28 17:54:17.676579-04
 44 | 2023-06-28 17:54:17.679962-04

And trying to reinsert them again on the publisher works, but if you look in the subscriber logs, it's throwing a conflict error PUBLISHER:

keith_pub=# insert into audit_log values (22, now());
INSERT 0 1

SUBSCRIBER logs:

2023-06-28 18:01:30.846 EDT [2023-06-28 18:01:30 EDT]  [716489] user=,db=,app=,err=00000,sess=649cadba.aeec9: LOG:  logical replication apply worker for subscription     "audit_sub" has started
2023-06-28 18:01:30.846 EDT [2023-06-28 18:01:30 EDT]  [716489] user=,db=,app=,err=00000,sess=649cadba.aeec9: DEBUG:  connecting to publisher using connection string     "host=localhost user=keith port=5433 dbname=keith_pub" 
2023-06-28 18:01:30.853 EDT [2023-06-28 18:01:30 EDT]  [716489] user=,db=,app=,err=23505,sess=649cadba.aeec9: ERROR:  duplicate key value violates unique constraint      "audit_log_p2023_pkey"
2023-06-28 18:01:30.853 EDT [2023-06-28 18:01:30 EDT]  [716489] user=,db=,app=,err=23505,sess=649cadba.aeec9: DETAIL:  Key (id)=(22) already exists.

Not quite sure how to fix this and not sure if it's a bug or not. I'm actually surprised the inserts even work. My understanding is that logical pub/sub replication required fully matching schemas on both sides for the intended tables. If you set this logical replication up with a matching partition set on the publication side, I imagine this would work fine.

My suggestion would be to make matching partition sets on both the publisher and subscriber. However, you can set the retention on the publisher side to be lower so you don't have to keep quite as much data around there.

rsquaretrade commented 1 year ago

Thank you for taking the time and effort. On the publisher end, there is definitely a primary key(id). The schema is public. On the subscriber end which is partitioned, the primary key is in the template table. The schema is also public. The partition sets are identical with the exception of the primary key(id) being in the template table. And yes, the "REPLICA IDENTITY" error appears to be triggered by deletes and also updates.

I'll have to re-read your results as a second reading (or multiple) will shed some more light. Thanks again for your time.

rsquaretrade commented 1 year ago

Hello. I neglected to tell you an important replication setting. Any chance you can test replication on your end?

wal_level=logical on OLTP is what is used for the logical extraction. wal_level=replica is on DW, the subscriber.

keithf4 commented 1 year ago

This doesn't make any difference in this case. That controls the output of the WAL stream on the relevant system.

rsquaretrade commented 1 year ago

When you created the table in the subscriber, do you have to start with a fresh table (in which you did) or can I use 'LIKE' [the original table inheritance table] ?

CREATE TABLE IF NOT EXISTS public.audit_log_native (LIKE public.audit_log) PARTITION BY RANGE (change_time);

public.audit_log is set up as table Inheritance and I'm migrating to native. I would not think it would matter but I'm not 100% sure.

keithf4 commented 1 year ago

I don't see why you couldn't use a LIKE clause. The result of a table made like that is still a normal table. It's just copying the schema from the original table.

rsquaretrade commented 1 year ago

We have tested twice and everything works well in sandbox, we are still troubleshooting why the error shows up post- migration in prod (we've since rollback due to it).

Separate question, do you have documentation on migrating OLTP from declarative/table inheritance to native and using publish_via_partition_root=true to replicate to the warehoue?

keithf4 commented 1 year ago

I don't have anything specific for that, no.

rsquaretrade commented 1 year ago

Hi. Question, I have a table that has a composite primary key that includes the partition key. I created a native partitioned table and also added the composite primary key in the parent and it worked, so I did not need a template table. I also used create_parent and it added a template table as described in pg_partman.md doc. Since I don't need the template table, does it hurt to UPDATE part_config to point the p_template to the parent? Or could I delete the template table? Just wondering.

keithf4 commented 1 year ago

Hi. Question, I have a table that has a composite primary key that includes the partition key. I created a native partitioned table and also added the composite primary key in the parent and it worked, so I did not need a template table. I also used create_parent and it added a template table as described in pg_partman.md doc. Since I don't need the template table, does it hurt to UPDATE part_config to point the p_template to the parent? Or could I delete the template table? Just wondering.

I don't think I have things fully tested in 4.x without having the template table at least existing there, even if it doesn't use it. Version 5.x actually makes it fully optional, so I'd recommend just leaving it there for now until that new version is out.

rsquaretrade commented 1 year ago

Hi. FYI, I took your advise in above post and it's working well.

Different question. I'm migrating another table to native but the partition key is an integer (currently, we have an internal function that creates the monthly child tables like so: finance_p202301, for Jan, etc..) but we'd rather use pg_partman for maintenance. Could we use LIST or RANGE partitioning but how do we configure pg_partman? Thanks in advance.

keithf4 commented 1 year ago

The only method pg_partman supports with an integer for the column but time-based is epoch partitioning. See the flags in the functions related to that for how to do that. If your integer column is epoch based (seconds, milliseconds, or nanoseconds), you could convert it. If it's something else, it does not at this time.

Also, it only supports RANGE at this time.

And for future reference, please make a new entry in Discussions if you just have a question instead of replying to existing issues. Thank you!

rsquaretrade commented 1 year ago

Hi, will do. Is it possible to keep this ticket open? We are still testing the original problem, we should know the results next week and this ticket can be closed. I've appreciated all your input, and thank you very much.

keithf4 commented 1 year ago

Sure it can be left open

rsquaretrade commented 1 year ago

Hi. We found the root-cause/bug: "logical replication's checking of replica identity when the target table is partitioned". The fix is in v14.5 but the Warehouse is currently at PostgreSQL 14.4 (Ubuntu 14.4-1.pgdg20.04+1). We will proceed after we've upgraded to 14.8.

keithf4 commented 1 year ago

Oh wow! Thanks for letting me know on that one.