citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

pg_partman run_maintenance() Fails Due to Parallel Operation on Distributed Table #7730

Open Apexample opened 1 week ago

Apexample commented 1 week ago

I've encountered an issue with pg_partman when running maintenance on a distributed table. The maintenance process fails with the following error message:

SQL Error [P0001]: ERROR: cannot modify table "xxx" because there was a parallel operation on a distributed table in the transaction
CONTEXT: SQL statement "ALTER TABLE public.xxx ATTACH PARTITION public.xxx_p20250101 FOR VALUES FROM ('2025-01-01 02:00:00+02') TO ('2025-02-01 02:00:00+02')"

Steps to Reproduce:

Run the run_maintenance() function on a distributed table that is subject to parallel operations.

Observe the error message mentioned above.

Workaround: I've found that setting citus.multi_shard_modify_mode to 'sequential' before running the maintenance function, and then reverting it to 'parallel' afterwards, resolves the issue. However, this workaround feels a bit hacky, and I'm not sure it is the most elegant solution.
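
For reference, here is a minimal sketch of that workaround. It assumes pg_partman is installed in a schema named partman (adjust if yours differs), and public.xxx stands in for the redacted parent table from the error above. It uses SET LOCAL so the setting is scoped to the transaction and falls back to its previous value on COMMIT or ROLLBACK. That scoping is a convenience I am assuming is acceptable here; it is not something required by Citus or pg_partman.

```sql
-- Sketch only: the schema name "partman" and the table "public.xxx" are placeholders.
BEGIN;

-- Limit the change to this transaction; the previous value is restored
-- automatically when the transaction ends.
SET LOCAL citus.multi_shard_modify_mode TO 'sequential';

-- Run maintenance for the affected partition set only.
SELECT partman.run_maintenance('public.xxx');

COMMIT;
```

Scoping the setting this way avoids having to remember to switch back to 'parallel' by hand, and it leaves other sessions unaffected.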

Additional Information:

Could you please advise on a better approach to running pg_partman maintenance in the presence of parallel operations, or let me know whether a fix for this issue is planned?

Thank you for your assistance.