citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.46k stars 663 forks source link

Dropping partition sets permanent lock & the partition never drops #5108

Closed xenatisch closed 3 years ago

xenatisch commented 3 years ago

Suppose we have a table defined as follows:

CREATE TABLE IF NOT EXISTS sample_dist
(
    distribution_id    VARCHAR(26)               NOT NULL,
    partition_id       VARCHAR(26)               NOT NULL,
    date               TIMESTAMP WITH TIME ZONE,
    payload            JSONB                     DEFAULT '{"value": null}',

    PRIMARY KEY (distribution_id, partition_id)

)
PARTITION BY LIST ( partition_id );

SELECT create_distributed_table('sample_dist', 'distribution_id');

and a bunch of indices.

We then create a partition:

CREATE TABLE IF NOT EXISTS sample_dist__determinant
PARTITION OF sample_dist ( partition_id )
FOR VALUES IN ('determinant');

and load it with data.

We then delete everything:

DELETE FROM sample_dist__determinant WHERE partition_id = "determinant";

Now, when I try to delete the partition using the following command:

DROP TABLE sample_dist__determinant;

a lock is set on the database, and the process keeps on going, even if I break the process mid-run.

Running

SELECT blocked_locks.pid     AS blocked_pid,
     blocked_activity.usename  AS blocked_user,
     blocking_locks.pid     AS blocking_pid,
     blocking_activity.usename AS blocking_user,
     blocked_activity.query    AS blocked_statement,
     blocking_activity.query   AS current_statement_in_blocking_process
FROM  pg_catalog.pg_locks         blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks         blocking_locks
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

yields something like the follows:

[
  {
    "blocked_pid": 19242,
    "blocked_user": "citus",
    "blocking_pid": 24513,
    "blocking_user": "postgres",
    "blocked_statement": "drop table sample_dist__determinant",
    "current_statement_in_blocking_process": "SELECT master_update_table_statistics(logicalrelid) FROM pg_dist_partition"
  }
]

and a few hundred of these seemingly self-blocking statements with varying PIDs:

[
  {
    "blocked_pid": 24858,
    "blocked_user": "postgres",
    "blocking_pid": 25829,
    "blocking_user": "postgres",
    "blocked_statement": "SELECT master_update_table_statistics(logicalrelid) FROM pg_dist_partition",
    "current_statement_in_blocking_process": "SELECT master_update_table_statistics(logicalrelid) FROM pg_dist_partition"
  }
]

The lock prevents almost all operations on the database, so the only way out would be to run:

SELECT pg_terminate_backend(<PID>);

If, however, I try the drop statements in sequential mode, such as follows:

BEGIN;
SET citus.multi_shard_modify_mode to 'sequential';
DROP TABLE sample_dist__determinant;
COMMIT;

The drop statement still blocks forever, but the lock PID is removed immediately when I break the process and run ABORT;.

Might be worth highlighting that we have ~2000 partitions and some 6.5 billion records in the database. Though this particular partition is empty!

I guess the ultimate issue here is that I can't delete the partition.

onderkalaci commented 3 years ago

this might have been fixed on Citus 10.0.3 via https://github.com/citusdata/citus/pull/4752

What version are you using? Output of SELECT citus_version(); shows the exact version

xenatisch commented 3 years ago

Oh, sorry, forgot to add that part:

Postgres version:

PostgreSQL 11.10 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit

Platform:

Azure Database for PostgreSQL - Hyperscale

Citus version:

Citus Enterprise 9.4.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit gitref: v9.4.2
xenatisch commented 3 years ago

So, for the purpose of my work not being stuck forever, this is what I've done as a workaround - in case someone is experiencing a similar issue:

ALTER TABLE covid19.time_series
    DETACH PARTITION sample_dist__determinant;

ALTER TABLE sample_dist__determinant RENAME TO sample_dist__determinant_void;

This allows me to proceed and recreate the partition with the correct list value (was wrong before). Still can't delete the partition, but at least I can make progress.

onderkalaci commented 3 years ago

I verified that there are no deadlocks on Citus 10.0.3. However, the citus_update_table_statistics fails, see https://github.com/citusdata/citus/issues/5116

Though, still from the caller of the DROP TABLEs perspective, there is no problem. The problem is the CloudPlane/Monitor node, which does the citus_update_table_statistics. They need to re-run the command to get the statistics.

In that regard, I'm planning to close this issue and focus on #5116 if that makes sense to you as well @xenatisch?

xenatisch commented 3 years ago

Sure. Unfortunately, I can't upgrade Citus version on Azure. It is my understanding that they are planning to do so later in the year though.

Any plans for a patch in version 9?

Feel free to close the issue either way, and thanks for the support.

onderkalaci commented 3 years ago

Any plans for a patch in version 9?

Let me discuss this with the team. I'll update the ticket once I learn back.

onderkalaci commented 3 years ago

Hi @xenatisch,

We are backporting the changes to the 9.4 and 9.5 releases. But, it'd probably take some time until the changes are available on HSC.

I'll follow-up with the team to speed-up the process as much as possible.