Open jnels124 opened 8 months ago
Is my understanding correct that the table (here public.transaction
) is a plain table on the source database, but a partitionned-table (and the parent table) on the target database?
The same problem we have encountered while using pgcopydb for Postgres to Timescale migration as well. The source would have plain Postgres table, but the same table on target will be partitioned using Postgres inheritance.
I've seen this also, very rare but it happens. I'm having trouble reproducing it with my own smaller example however to share.
I think you need to split the schema and data parts of the processing, and then tweak the schema on the target database to introduce partitioning, in order to repro. See https://pgcopydb.readthedocs.io/en/latest/tutorial.html#how-to-edit-the-schema-when-copying-a-database for details.
Now if you can reproduce, the way to fix it might be to decide if we can use TRUNCATE or TRUNCATE ONLY by having a look at the TARGET schema to see if the table is partitioned there, whatever it might be on the source. At the moment we only have catalog entries for tables on the sourceDB, we need to grab them at the right point in time for the targetDB too now.
@dimitri If I understand correctly, we should TRUNCATE
if the table on the target is partitioned but not on the source. If the table remains partitioned on both the source and target, we must stick to TRUNCATE ONLY
, correct?
We have many cases to consider, but today only consider a few. The deal with TRUNCATE is how we use it in the same transaction as the COPY statement, in order to be able to use the COPY FREEZE option. In the case when either the table is not partitioned, or partitioned exactly the same way on the source and the target, that's fine. In any other case, we should disable this optimization.
Here some details of the reasoning:
Then we might want to also have support for a change of partitioning scheme between source and target, including more partitioning levels (sub-partitions). In which case we want to route everything to the new parent table of course. And again, implement TRUNCATE on the parent table on the target, allowing Postgres to descend in the hierarchy.
The simplest approach for now would consist of assuming we target the same partitioning schema on source and target when given a full clone command with --drop-if-exists
. A second step may include checking for the same partitioning scheme on both source and target in cases when the schema might have been tampered with on the target, but waiting for that we should certainly revisit our approach and have the target Postgres instance route the data in the partitions, and have pgcopydb target the parent table only.
I am also getting ERROR: cannot truncate only a partitioned table
for the case where source and target schemas are exactly the same. (In fact, target was empty of any schemas). The table partitions are created correctly.
However, I think the issue is that the partitions are nested. For example:
tbl_sales
partitioned by category,
tbl_sales_retail
partitioned by year,
tbl_sales_retail_2023
partitioned by month, and
tbl_sales_retail_202312
has the data, not partitioned.
I see the error when TRUNCATE ONLY
is applied to the two middle nested partition levels.
UPDATE:
I was finally able to work around this issue by adding all the parent and nested parent partition tables to the [exclude-table-data]
section of the filter file. This allows the partition hierarchy to be copied correctly from source to target, but avoids any TRUNCATE ONLY
on the intermediate parent tables.
Side note: Probably documented somewhere, but learned you cannot have blank lines within any one of the ini filter sections. (Was adding blank lines for readability since the exclude list was lengthy).
I am attempting to use pgcopydb to migrate data from our current schema to our new schema which uses citus. When executing
pgcopydb copy table-data --table-jobs 8 --index-jobs 2
errors are logged indicating thattruncate only
was called on the parent table for a partitioned table. Truncate only is only valid to execute on specific partitions but not the parent table.