Closed xenatisch closed 3 years ago
this might have been fixed on Citus 10.0.3 via https://github.com/citusdata/citus/pull/4752
What version are you using? The output of SELECT citus_version(); shows the exact version.
Oh, sorry, forgot to add that part:
Postgres version:
PostgreSQL 11.10 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
Platform:
Azure Database for PostgreSQL - Hyperscale
Citus version:
Citus Enterprise 9.4.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit gitref: v9.4.2
So that my work isn't stuck forever, this is what I've done as a workaround, in case someone else is experiencing a similar issue:
ALTER TABLE covid19.time_series
DETACH PARTITION sample_dist__determinant;
ALTER TABLE sample_dist__determinant RENAME TO sample_dist__determinant_void;
This allows me to proceed and recreate the partition with the correct list value (was wrong before). Still can't delete the partition, but at least I can make progress.
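For anyone following the same workaround, the recreate step might look roughly like the sketch below. It assumes the parent table is LIST-partitioned (suggested by "list value" above) and that the partition key accepts a text literal; 'corrected_value' is a hypothetical placeholder, not a value from this thread.

```sql
-- Sketch only: 'corrected_value' is a hypothetical placeholder for the
-- real list value, which is not given in the thread.
-- The original partition name is free again after the RENAME above.
CREATE TABLE sample_dist__determinant
    PARTITION OF covid19.time_series
    FOR VALUES IN ('corrected_value');

-- Check which tables are currently attached to the parent as partitions.
-- The detached *_void table should no longer be listed.
SELECT inhrelid::regclass AS partition_name
FROM pg_inherits
WHERE inhparent = 'covid19.time_series'::regclass
ORDER BY 1;
```

The detached sample_dist__determinant_void table stays around as a standalone table until it can be dropped.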
I verified that there are no deadlocks on Citus 10.0.3. However, citus_update_table_statistics fails, see https://github.com/citusdata/citus/issues/5116
Still, from the perspective of the caller of the DROP TABLEs, there is no problem. The problem is on the CloudPlane/Monitor node, which runs citus_update_table_statistics. They need to re-run the command to get the statistics.
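Re-running the statistics update by hand could look like the following minimal sketch. It assumes Citus 10.0 or later, where the function is exposed as citus_update_table_statistics(regclass). The table name is taken from the workaround above and is only an assumption about which relation the monitor targets.

```sql
-- Sketch only: assumes Citus 10.0+ and that covid19.time_series is the
-- distributed table whose statistics need refreshing (an assumption).
SELECT citus_update_table_statistics('covid19.time_series');
```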
In that regard, I'm planning to close this issue and focus on #5116 if that makes sense to you as well @xenatisch?
Sure. Unfortunately, I can't upgrade the Citus version on Azure. It is my understanding that they are planning to do so later in the year, though.
Any plans for a patch in version 9?
Feel free to close the issue either way, and thanks for the support.
Any plans for a patch in version 9?
Let me discuss this with the team. I'll update the ticket once I hear back.
Hi @xenatisch,
We are backporting the changes to the 9.4 and 9.5 releases. However, it will probably take some time until the changes are available on HSC.
I'll follow up with the team to speed up the process as much as possible.
Suppose we have a table defined as follows:
and a bunch of indices.
We then create a partition:
and load it with data.
We then delete everything:
Now, when I try to delete the partition using the following command:
a lock is set on the database, and the process keeps on going, even if I break the process mid-run.
Running
yields something like the following:
and a few hundred of these seemingly self-blocking statements with varying PIDs:
The lock prevents almost all operations on the database, so the only way out would be to run:
If, however, I try the drop statements in sequential mode, as follows:
The drop statement still blocks forever, but the lock PID is removed immediately when I break the process and run the ABORT; statement.
It might be worth highlighting that we have ~2000 partitions and some 6.5 billion records in the database, though this particular partition is empty!
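For readers unfamiliar with sequential mode, a drop in that mode is usually written along the lines of the sketch below. This is not necessarily the exact statement from the original post. It assumes the citus.multi_shard_modify_mode setting is what "sequential mode" refers to, and the table name is borrowed from the workaround in the comments.

```sql
-- Sketch only: assumes "sequential mode" means the Citus setting
-- citus.multi_shard_modify_mode; the table name is an assumption.
BEGIN;

-- SET LOCAL limits the setting to this transaction.
SET LOCAL citus.multi_shard_modify_mode TO 'sequential';

DROP TABLE sample_dist__determinant;

COMMIT;

-- If the DROP is interrupted instead of completing, the open transaction
-- can be rolled back with:
--   ABORT;
```

Running the statement inside an explicit transaction is what makes ABORT; meaningful after an interrupt: it rolls the transaction back and releases the locks it was holding.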
I guess the ultimate issue here is that I can't delete the partition.
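To inspect a situation like the one described above, something along these lines can show which backends are blocked and by whom. This is a generic sketch built on standard PostgreSQL functions (available on PostgreSQL 9.6 and later), not the queries used in the original post, and the PID in the last two statements is a hypothetical placeholder.

```sql
-- List backends that are waiting on a lock, together with the PIDs
-- of the sessions blocking them.
SELECT a.pid,
       pg_blocking_pids(a.pid) AS blocked_by,
       a.state,
       a.wait_event_type,
       a.wait_event,
       left(a.query, 80)       AS query_start
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0
ORDER BY a.pid;

-- Ask a specific backend to cancel its current query.
-- 12345 is a hypothetical PID taken from the output above.
SELECT pg_cancel_backend(12345);

-- If cancelling is not enough, terminate the backend's session.
SELECT pg_terminate_backend(12345);
```

On a Citus cluster, this only shows sessions on the node where it is run, so the picture on the coordinator may differ from the one on the workers.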