Follow-up question: Does it pay to pre-create the partitions I know I'll need (eg set p_start_partition back to May 2011)? Or should I just let partition_data_proc() handle that? Does pre-creating them save any time/hassle for partition_data_proc()?
What is the partition interval you're going for? If you're going with native partitioning, remember you can't have the batch size be any smaller than the partition size, so you'll have to partition by day. Going back that many years means a lot of child tables. If you're on PG12 or 13 I think you'll be fine. Before that, I saw some performance issues once the child count got up to around several thousand, I think (don't quote me on that).
The main reason for the precreation is to avoid the contention of trying to create the child tables on demand as they're needed by the incoming data. Whether it's of any further benefit will, I think, depend on how much you're actually trying to use the data while it's being moved around. There are some heavier locks taken when the child tables are created, so if they're all made beforehand, you avoid those locks while partitioning the data.
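For example, pre-creating the children back to May 2011 would be done when the partition set is first created, roughly like this (a sketch only; the control column and interval are assumptions, and p_start_partition is taken from the question above):

SELECT partman.create_parent(
    p_parent_table := 'public.foo',
    p_control := 'date',
    p_type := 'native',
    p_interval := 'monthly',
    p_start_partition := '2011-05-01'
);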
You say you need to keep the data available in the database, but do you actually need it available IN the actual partition table at all times? You could certainly keep older "archived" data in a separate table and just have your application be aware that if a certain column's date is older than, say, 6 months, it should check the archive table as well (data may be in transit in batches, so it might have to check both locations if it's really close to the cutoff time).
So you could quickly try and get the most recent 6 months partitioned out daily into the currently "active" table. Then over a longer period of time move that old data to the "archive" table. Could have that archive table be partitioned by month or even year.
Just some ideas.
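A rough illustration of that active/archive split (the table and column names here are hypothetical, and $1/$2 are bind parameters; the 6-month cutoff is just the example above):

-- near the cutoff a row could be in either table, so check both
SELECT * FROM public.foo          -- "active": last ~6 months, partitioned daily
 WHERE some_id = $1 AND date >= $2
UNION ALL
SELECT * FROM public.foo_archive  -- "archive": older data, partitioned monthly or yearly
 WHERE some_id = $1 AND date >= $2;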
This is on PG 12.4.
My understanding was that I could have a custom batch size if I was processing data from another table (not from the default partition). My tests seem to confirm that as well. I have a table partitioned monthly and I'm calling partition_data_proc() like this:
CALL partman.partition_data_proc(
p_parent_table := 'public.foo',
p_source_table := 'public.foo_old',
p_interval := '1 day',
p_batch := 180,
p_order := 'DESC'
);
I agree that daily partitions going that far back would not be great.
So I ran that call as I just posted and it took almost 5 hours in my quiet prod clone. The partitioned table has 3 non-unique indexes on it. I wonder if it's worth testing this import with no indexes on the old partitions. I would then add them afterward.
Question there is how best to add indexes to existing partitions. Do I have to make a clever script to do them all individually or is there something that can be done from the parent table or a pg_partman function?
Ahh yeah you can do the custom size if it's coming from a table that is not the default.
If it's a non-unique index, adding it to the parent should add it to all the children automatically. Not sure if that can be done concurrently or not. If it's unique and does not contain the partition key, that will be something you have to manually do. I tried adapting the old reapply_indexes.py script to work with native index inheritance, but it was just too messy trying to also account for what native does for you automatically.
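For the non-unique case that can be a single statement on the parent, e.g. (index and column names here are just placeholders):

-- creating the index on the partitioned parent cascades it to every
-- existing child and to any children created later
CREATE INDEX foo_date_idx ON public.foo (date);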
Alright I'll play around with it and see. This table is only using non-unique indexes so no worries there.
If there is a concurrency problem, could I add indexes to the partition tables manually? Would adding the index to the parent then try to add new indexes on top of those? Or recognize that they already exist?
All partitions prior to the current month are always read-only (it's all insert-only data).
I believe adding an index to the parent automatically tries to add it to all children. If it already exists on the children, I'm not sure what it does.
Alright I'll play around with this and report back.
The docs do say this:
Concurrent builds for indexes on partitioned tables are currently not supported. However, you may concurrently build the index on each partition individually and then finally create the partitioned index non-concurrently in order to reduce the time where writes to the partitioned table will be locked out. In this case, building the partitioned index is a metadata only operation.
I wonder if you have to use a specific naming convention for it to recognize this. I'll test later.
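What the quoted docs describe would look roughly like this, using the table that shows up later in this thread (a sketch, not a tested recipe):

-- build the index concurrently on each partition first
CREATE INDEX CONCURRENTLY ON user_aggregations_p2019_11 (device_id, date, user_id);
-- ...repeat for every partition...

-- then create it on the parent (CONCURRENTLY is not allowed here); PostgreSQL
-- finds the matching child indexes and attaches them, so this step is
-- metadata-only
CREATE INDEX ON user_aggregations (device_id, date, user_id);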
I noticed that the indexes that are created on partitions don't necessarily follow the name given on the parent. They rename them to use the columns involved. Is that something pg_partman is doing, or is that postgres?
e.g. I created this index on my parent partitioned table:
"user_aggregations_foo_idx" btree (device_id, date, user_id)
but on the partitions, it doesn't use the foo naming at all:
"user_aggregations_p2020_07_device_id_date_user_id_idx" btree (device_id, date, user_id)
(quick update): Looks like the naming truncates after a certain point, seems to be 32 characters:
"user_aggregations_p2020_09_primary_international_id_date_us_idx" btree (primary_international_id, date, user_id)
Tests so far seem good. If I pre-create the indexes on the partitions, it doesn't seem to care what the name is when I create the index on the parent table. It looks like it recognizes that an index is there and it doesn't need to do anything. I'll try to make a write-up for this later with some example code.
I doubt it's any naming convention to see if an index already exists. You can look up in the catalogs which columns an index is on so that's all it's probably doing.
PostgreSQL has a 64 byte (not character) limit on object names and it will truncate anything after that. pg_partman does manage that for the child table names to ensure a suffix is always the last thing, but it doesn't care about index names.
pg_partman is doing nothing at all with indexes with native partitioning unless you're using the template system.
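For example, a catalog query along these lines shows each child index and its definition, which is enough to tell whether an equivalent index already exists (just an illustrative query, not something pg_partman itself runs):

SELECT tablename, indexname, indexdef
  FROM pg_indexes
 WHERE schemaname = 'public'
   AND tablename LIKE 'user_aggregations_p%';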
So does pg_partman basically take the name, truncate to 60 chars and then append _idx? Just curious, although it doesn't seem like I have to use that same column-name-based naming convention. Running a couple more tests this morning.
Looks like you're right. Here's what I did:
1. Used create_parent() to pre-create the partitions. No indexes yet.
2. Created the index concurrently on the individual partitions with the foo name, eg user_aggregations_p2019_11_foo_idx.
3. Created the index on the parent with the bar name. No duplicate effort is made.

test=# \d user_aggregations
Partitioned table "public.user_aggregations"
Column | Type | Collation | Nullable | Default
--------------------------+--------------------------+-----------+----------+---------
user_id | character varying(36) | | not null |
primary_international_id | character varying(36) | | not null |
device_id | character varying(36) | | not null |
date | timestamp with time zone | | not null |
Partition key: RANGE (date)
Indexes:
"user_aggregations_bar_idx" btree (primary_international_id, date, user_id)
"user_aggregations_date_idx" btree (date DESC)
Number of partitions: 34 (Use \d+ to list them.)
test=# \d user_aggregations_p2019_11
Table "public.user_aggregations_p2019_11"
Column | Type | Collation | Nullable | Default
--------------------------+--------------------------+-----------+----------+---------
user_id | character varying(36) | | not null |
primary_international_id | character varying(36) | | not null |
device_id | character varying(36) | | not null |
date | timestamp with time zone | | not null |
Partition of: user_aggregations FOR VALUES FROM ('2019-11-01 00:00:00+00') TO ('2019-12-01 00:00:00+00')
Indexes:
"user_aggregations_p2019_11_date_idx" btree (date DESC)
"user_aggregations_p2019_11_foo_idx" btree (primary_international_id, date, user_id)
Created a howto doc for converting a normal table to a natively partitioned table
https://github.com/pgpartman/pg_partman/blob/master/doc/pg_partman_howto_native.md
Splitting this off from Issue #325.
Scenario is I have a table that we want to partition. The table has over 1.4 BILLION-with-a-B rows. Typically queries only go back 6 months but for whatever business reasons we have to keep all of it (going back almost 10 years) in the DB (we're working that out in a different effort).
My original thought process was that we need to partition it but keep the old data available right away, so I looked at either trigger/inheritance-based partitioning or native partitioning with the old table swapped in as the default partition. Based on your mention of potential write performance impact, we're shying away from the trigger method. Native with a default is a problem if we have to migrate data in batches of the partition interval, which would be 1 month at a time, and that's a lot of data per batch.
I broke my head out of that mold and think if I just rename the old table, create a new partitioned table, and then migrate data into it, that might work for us. That allows me to use custom (eg 1 day) batch intervals to move the data. The app team says they'd need the last 6 months of data ASAP, so I'd call partition_data_proc with p_source_table := 'renamed_orig_table_name', p_interval := '1 day', p_batch := 180 (for 180 days or roughly 6 months), and p_order := 'DESC' to start with the most recent data.

Does this make sense or am I misunderstanding something? I'm going to be setting up a test case later today but wanted to see if I'm on the right track.
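That rename-and-reload approach could look roughly like this (a sketch; the control column and the LIKE options are assumptions, and the table names follow the CALL example earlier in the thread):

-- keep the original data under a new name
ALTER TABLE public.foo RENAME TO foo_old;

-- recreate the table as a natively partitioned parent
CREATE TABLE public.foo (LIKE public.foo_old INCLUDING DEFAULTS)
    PARTITION BY RANGE (date);

-- then run create_parent() to build the children and the
-- partition_data_proc() call described above to move the data in daily batches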