pgpartman / pg_partman

Partition management extension for PostgreSQL
Other
2.09k stars 281 forks source link

partition_data_proc failing with overlap error #373

Closed jarrettprosser closed 3 years ago

jarrettprosser commented 3 years ago
I have a postgres 12.5 database with a table which was set up with native daily partitioning using pg_partman 4.4.0, but the background worker was not enabled in the configuration. As a result, we have 9 partitions (the data the database was create and 4 days either side) plus the default partition. As this has been running for a month with new data coming in, the default partition is now large compared to the daily partitions. This is the current list of partitions: Table Name Rows Partition Expression
meter_pq_default 1060653 DEFAULT
meter_pq_p2021_06_04 0 FOR VALUES FROM ('2021-06-04 10:00:00+10') TO ('2021-06-05 10:00:00+10')
meter_pq_p2021_06_05 0 FOR VALUES FROM ('2021-06-05 10:00:00+10') TO ('2021-06-06 10:00:00+10')
meter_pq_p2021_06_06 0 FOR VALUES FROM ('2021-06-06 10:00:00+10') TO ('2021-06-07 10:00:00+10')
meter_pq_p2021_06_07 0 FOR VALUES FROM ('2021-06-07 10:00:00+10') TO ('2021-06-08 10:00:00+10')
meter_pq_p2021_06_08 0 FOR VALUES FROM ('2021-06-08 10:00:00+10') TO ('2021-06-09 10:00:00+10')
meter_pq_p2021_06_09 0 FOR VALUES FROM ('2021-06-09 10:00:00+10') TO ('2021-06-10 10:00:00+10')
meter_pq_p2021_06_10 0 FOR VALUES FROM ('2021-06-10 10:00:00+10') TO ('2021-06-11 10:00:00+10')
meter_pq_p2021_06_11 57272 FOR VALUES FROM ('2021-06-11 10:00:00+10') TO ('2021-06-12 10:00:00+10')
meter_pq_p2021_06_12 68308 FOR VALUES FROM ('2021-06-12 10:00:00+10') TO ('2021-06-13 10:00:00+10')

I am attempting to clean up the partitioning and enable the background worker so that partitions are maintained correctly. I have enabled the background worker, but it fails with the error:

2021-07-09 00:10:31.768 GMT [180] ERROR:  updated partition constraint for default partition would be violated by some row
    CONTEXT: SQL statement "ALTER TABLE public.meter_pq ATTACH PARTITION public.meter_pq_p2021_06_13 FOR VALUES FROM ('2021-06-13 00:00:00+00') TO ('2021-06-14 00:00:00+00')"

I understand from this issue that this is because there is data in the default partition which should be in the newly created partition. I need to manually run partition_data_proc on the table to migrate the data from default to the correct partitions. When I do so, though, I get this error:

SQL Error [P0001]: ERROR: partition "meter_pq_p2021_06_13" would overlap partition "meter_pq_p2021_06_12"
CONTEXT: SQL statement "ALTER TABLE public.meter_pq ATTACH PARTITION public.meter_pq_p2021_06_13 FOR VALUES FROM ('2021-06-13 00:00:00+10') TO ('2021-06-14 00:00:00+10')"

The syntax for this new partition seems correct - the values are similar to those for the existing partitions. Do I need to configure something differently to create these new partitions and clean up my data?

Thanks!

Edit: database is postgres 12.5, not 12.2 as originally described.

keithf4 commented 3 years ago

What timezone are you in and when does DST occur in your timezone? Just had two other issues recently where something like this was happening and it came down to crossing DST boundaries, so want to try and eliminate that as a possible issue.

jarrettprosser commented 3 years ago

I am in Australian Eastern Standard Time (Australia/Melbourne) - DST ended on Sunday, 4 April 2021 and will start again on Sunday, 3 October 2021. All of the partitions for this data (for now) would be in the period with no DST.

keithf4 commented 3 years ago

Ok. Normally the partition_data_proc() should take care of moving the data out to a temporary table then move it back in. Can you give me the full command with all the options you're giving to it? Also the min & max values that are in your default partition?

jarrettprosser commented 3 years ago

Sure thing! I'm running

call partman.partition_data_proc('public.meter_pq');

The table is partitioned by the field time_key. The max time_key value in the default partition is 2021-07-09 11:32:00, the min is 2021-06-13 10:00:00.

keithf4 commented 3 years ago

Ok. Not sure when I'll get to testing it out. But i think i have what i need to reproduce the situation at least.

jarrettprosser commented 3 years ago

I've had more of a look into this and I think I might have an answer. It looks like clients using the postgres JDBC driver inherit the timezone of the local machine. First, I've set up a Dockerfile to install partman into the bitnami/postgres container.

ARG PG_VERSION=12.7.0
FROM bitnami/postgresql:${PG_VERSION}
ARG PARTMAN_VERSION=4.5.1
USER root
RUN apt-get update && apt-get install -y build-essential
RUN curl -L "https://github.com/pgpartman/pg_partman/archive/refs/tags/v${PARTMAN_VERSION}.tar.gz" --output "pg_partman.tar.gz"
RUN tar -xvf pg_partman.tar.gz && cd pg_partman-${PARTMAN_VERSION} && make install
USER 1001

I also set up partition_internal.sql to create a simple table, partition it, and report the partitions:

-- configure partman
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

CREATE ROLE partman WITH LOGIN;
GRANT ALL ON SCHEMA partman TO partman;
GRANT ALL ON ALL TABLES IN SCHEMA partman TO partman;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO partman;
GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO partman;  -- PG11+ only
GRANT ALL ON SCHEMA public TO partman;

-- create a table
CREATE TABLE public.internal (
    device_key int4 NOT NULL,
    time_key timestamptz NOT NULL,
    measurement float4 NULL,
    CONSTRAINT internal_pkey PRIMARY KEY (device_key, time_key)
)
PARTITION BY RANGE (time_key);

-- partition the table
SELECT partman.create_parent('public.internal','time_key','native','daily');

-- inspect the partitions for the table
select pt.relname as partition_name,
       pg_get_expr(pt.relpartbound, pt.oid, true) as partition_expression
from pg_class base_tb 
  join pg_inherits i on i.inhparent = base_tb.oid 
  join pg_class pt on pt.oid = i.inhrelid
where base_tb.oid = 'public.internal'::regclass
order by partition_name;

Then I run setup.sh to build the container, and run it, and then execute the script using the psql application inside the database container:

#!/bin/bash
PG_VERSION=12.5.0
PARTMAN_VERSION=4.4.0
PG_IMAGE=postgres:pg${PG_VERSION}-partman${PARTMAN_VERSION}
# build the image
docker build . --build-arg PG_VERSION=${PG_VERSION} --build-arg PARTMAN_VERSION=${PARTMAN_VERSION} -t ${PG_IMAGE}
# start the db
docker run --rm -d --name pg -e POSTGRESQL_PASSWORD=password123 -p 5432:5432 ${PG_IMAGE}
# wait for postgres to start up
sleep 5
# run partition script
docker cp ./partition_internal.sql pg:/tmp/partition_internal.sql
docker exec -e PGPASSWORD=password123 pg psql -h localhost -d postgres -U postgres -f /tmp/partition_internal.sql

The output from psql looks correct here:

    partition_name    |                           partition_expression                           
----------------------+--------------------------------------------------------------------------
 internal_default     | DEFAULT
 internal_p2021_07_25 | FOR VALUES FROM ('2021-07-25 00:00:00+00') TO ('2021-07-26 00:00:00+00')
 internal_p2021_07_26 | FOR VALUES FROM ('2021-07-26 00:00:00+00') TO ('2021-07-27 00:00:00+00')
 internal_p2021_07_27 | FOR VALUES FROM ('2021-07-27 00:00:00+00') TO ('2021-07-28 00:00:00+00')
 internal_p2021_07_28 | FOR VALUES FROM ('2021-07-28 00:00:00+00') TO ('2021-07-29 00:00:00+00')
 internal_p2021_07_29 | FOR VALUES FROM ('2021-07-29 00:00:00+00') TO ('2021-07-30 00:00:00+00')
 internal_p2021_07_30 | FOR VALUES FROM ('2021-07-30 00:00:00+00') TO ('2021-07-31 00:00:00+00')
 internal_p2021_07_31 | FOR VALUES FROM ('2021-07-31 00:00:00+00') TO ('2021-08-01 00:00:00+00')
 internal_p2021_08_01 | FOR VALUES FROM ('2021-08-01 00:00:00+00') TO ('2021-08-02 00:00:00+00')
 internal_p2021_08_02 | FOR VALUES FROM ('2021-08-02 00:00:00+00') TO ('2021-08-03 00:00:00+00')

Then I connect to the container with an external client (DBeaver) and execute very similar SQL to create and partition a new table:

-- create a table
CREATE TABLE public.external (
    device_key int4 NOT NULL,
    time_key timestamptz NOT NULL,
    measurement float4 NULL,
    CONSTRAINT external_pkey PRIMARY KEY (device_key, time_key)
)
PARTITION BY RANGE (time_key);

-- partition the table
SELECT partman.create_parent('public.external','time_key','native','daily');

I exec into the container and run psql again to check this new table:

docker exec -it -e PGPASSWORD=password123 pg psql -h localhost -d postgres -U postgres
postgres=# select pt.relname as partition_name,
       pg_get_expr(pt.relpartbound, pt.oid, true) as partition_expression
from pg_class base_tb 
  join pg_inherits i on i.inhparent = base_tb.oid 
  join pg_class pt on pt.oid = i.inhrelid
where base_tb.oid = 'public.external'::regclass;

These partitions have a start and end offset that corresponds with my local machine's timezone.

    partition_name    |                           partition_expression                           
----------------------+--------------------------------------------------------------------------
 external_default     | DEFAULT
 external_p2021_07_25 | FOR VALUES FROM ('2021-07-24 14:00:00+00') TO ('2021-07-25 14:00:00+00')
 external_p2021_07_26 | FOR VALUES FROM ('2021-07-25 14:00:00+00') TO ('2021-07-26 14:00:00+00')
 external_p2021_07_27 | FOR VALUES FROM ('2021-07-26 14:00:00+00') TO ('2021-07-27 14:00:00+00')
 external_p2021_07_28 | FOR VALUES FROM ('2021-07-27 14:00:00+00') TO ('2021-07-28 14:00:00+00')
 external_p2021_07_29 | FOR VALUES FROM ('2021-07-28 14:00:00+00') TO ('2021-07-29 14:00:00+00')
 external_p2021_07_30 | FOR VALUES FROM ('2021-07-29 14:00:00+00') TO ('2021-07-30 14:00:00+00')
 external_p2021_07_31 | FOR VALUES FROM ('2021-07-30 14:00:00+00') TO ('2021-07-31 14:00:00+00')
 external_p2021_08_01 | FOR VALUES FROM ('2021-07-31 14:00:00+00') TO ('2021-08-01 14:00:00+00')
 external_p2021_08_02 | FOR VALUES FROM ('2021-08-01 14:00:00+00') TO ('2021-08-02 14:00:00+00')

If I connect to the database with psql running locally and execute the same SQL, there is no time offset. I also tried a couple of additional containers with different timezones set, running psql to create the tables and partitions, and there was no offset. Connecting with a different database client (Beekeeper Studio) creates partitions the same as psql.

Our application and DBeaver both use the Postgres JDBC driver to connect to the Postgres database. According to this the JDBC driver will use the client's JVM timezone when the JDBC driver sets or retrieves time related data. When we create the partitions initially they are made with a timezone offset, but when we run partition_data_proc it's not using the timezone offset. That's what's creating the partition overlap in the original issue report.

keithf4 commented 3 years ago

Glad you were able to figure this out! Sorry I didn't get a chance to investigate more myself. Timezones and DST seem to be the bane of partitioning lately going by several other issues that have been opened recently.

Is there any further assistance you need with this?

jarrettprosser commented 3 years ago

No worries, it was bugging me that I was getting those time offsets when creating the initial partitions so I wanted to try a few more things out.

Further assistance is up to you, I suppose! It would be good if partman would create consistent partition boundaries regardless of the SQL client. Alternatively, if there are existing partitions with a timezone offset, perhaps it could use that offset when creating new partitions with maintenance? I imagine either of those options would be fairly complicated to implement for a narrow edge case like this.

In the meantime, I'll try overriding the timezone in the JDBC connection so that everything's created in the same timezone as the DB, which should let me get the background worker again. I'm happy if you'd like to close this issue as that solution isn't within the scope of the partman project.

Thanks!

keithf4 commented 3 years ago

Yeah unfortunately that's a bit more complicated than I'd like to get and would require a lot of testing to make sure there's not any odd side effects. Especially when the solution without that fix is to just ensure that the initial setup and following maintenance runs are run within the same timezone.

After all these recent issues, I think the docs now deserve a section somewhere to discuss best practices with timezones and DST. So I do appreciate you opening the issue and pointing out this particular situation. I'll be sure to include it.

jarrettprosser commented 3 years ago

Understood. Thanks for the help!