Closed adrianlzt closed 1 year ago
:+1: +1 for this.
We've encountered this same issue and have a couple of workarounds but would love to see better support for this timezone change. The way that we have approached this so far is either:
So I'll leave this issue open for a while to see if anyone has any other suggestions, but @cj13579's two suggestions are really the only way I can think to handle this if you must use hourly partitioning. I would make sure that you have your premake set high enough that it can make the necessary child tables for the DST change before that data can possibly start coming in.
And reading back I just noticed you had your premake set to 1. That is most definitely not going to work well trying to handle DST. The default premake is 4 and I believe should alleviate some of the problem. Can set it even higher if you have "future" data coming in relative to the current time. There's little to no penalty for having several empty partitions ahead of your current data timeframe.
My first example was showing that creating hourly partitions was not working properly, I set premake=1 because it does not affect for that particular case, when the partitions were being created between 2020-10-25 00:00:00+02 and now.
Yeah, UTC is a good option. Maybe another one is to use timezone or unix epoch in the partition names and for the time iteration.
I see yeah. I've dealt with DST changes before in another extension and it was not particularly easy ( just don't do anything - https://github.com/keithf4/mimeo/blob/master/doc/mimeo.md). I think it would be even more challenging here because people can literally choose any partition interval they want, so figuring out the proper boundary values in all cases may not work.
But if there's anyone out there that's more experienced with managing time/DST in this case, I'd be happy to hear suggestions.
Before using pgpartman I did an small POC to create partitions, ad-hoc for our use case of Zabbix.
https://gitlab.opensolutions.cloud/adrian/zabbix-partition-manager/blob/master/zpartition.go
My approach was to use unix epoch and alignment using the epoch unix 0 and a lots of unit testing. But the partitions names are ugly.
Maybe forcing the partitioning to be UTC as suggested by @cj13579
We run into a similar issue.
Our tz is:
show timezone;
Europe/Berlin
The range column is created_at timestamptz NOT NULL DEFAULT NOW()
Partman config
SELECT partman.create_parent('webhook_logs.logs', 'created_at', 'native', 'weekly', p_template_table := 'webhook_logs.logs_template');
UPDATE partman.part_config SET infinite_time_partitions = false, premake = 8, retention = '4 weeks', retention_keep_table=false WHERE parent_table = 'webhook_logs.logs';
Partman will partition the table with overlaps as follow.
This results in errors when we insert data that call partman.partition_data_proc('webhook_logs.logs')
.
SQL Error [P0001]: ERROR: partition "logs_p2021w48" would overlap partition "logs_p2021w47"
CONTEXT: SQL statement "ALTER TABLE webhook_logs.logs ATTACH PARTITION webhook_logs.logs_p2021w48 FOR VALUES FROM ('2021-11-29 00:00:00+01') TO ('2021-12-06 00:00:00+01')"
PL/pgSQL function create_partition_time(text,timestamp with time zone[],boolean,text) line 239 at EXECUTE
...
This can be a workaround by using timestamp
as a partition column type. In that case, all partitions have set the correct boundaries and call partman.partition_data_proc('webhook_logs.logs')
will run successfully.
We could fix it. The issue is dbeaver that runs and displays everything in your local timezone. I couldn't find a way to set it off.
Be very careful using timestamp
vs timestamptz
. That almost never has the use-cases that people intend for it. It literally stores the time exactly as given with no context to what timezone the client nor server may be in.
Ex:
With timestamp
, when I insert 2021-10-29 12:55:36
, that is the literal time value that is stored. No matter who queries it from whatever timezone they are in, it always returns exactly 12:55:36
, with no indication of what timezone that time is relative to. For this to work properly, every single system in your entire environment that ever interacts with the database must be the same timezone or be aware of exactly which timezone all the data that is being inserted into the database is. If not, and someone thinks it's EST and another person thinks its PST, one of them will be wrong and get the wrong data back. Or they could both be wrong.
With timestamptz
, the time is always stored in UTC and adjusted accordingly depending on what timezone the client is in. So if I insert 2021-10-29 12:55:36
and I'm in eastern time, then someone querying that data from central time will get the proper value of 2021-10-29 11:55:36
.
So my recommendation is to always use timestamptz
unless you've very carefully proved your use-case for timestamp
in your environment.
Thanks for the info!
So my recommendation is to always use timestamptz unless you've very carefully proved your use-case for timestamp in your environment.
Exactly, in that specific case, the timezone would be interpreted as UTC only and only be used for partitioning. But the issue was solved so we use timestamptz
for now :smile:
I'm going to close this issue for now. I think the best advice really is to run the system in UTC if you're going to be doing partitioning schemes that would have issues with DST.
create_parent
does not create partitions correctly with time change happens from summer to winter.In human readable format:
So there is a hole (no partition) between 1603587600 (02:00:00 CET) and 1603591200 (03:00:00 CET).
With the summer time change, there are no holes, just a jump in naming:
A related error happens when
partition_data_proc
is called and there is data in 2:00 summer and in 2:00 winter.Looks like it is trying to push the 1603587601 (Sun Oct 25 02:00:01 CET 2020) into bugpartman_p2020_10_25_0200, that should have contraints
FROM (1603584000) TO (1603587600)
(Sun Oct 25 02:00:00 CEST 2020 -> Sun Oct 25 02:00:00 CET 2020)