darold / pgcluu

PostgreSQL Cluster performances monitoring and auditing tool
http://pgcluu.darold.net/
PostgreSQL License

A DROP TABLE can block pgcluu_collectd #129

Closed Krysztophe closed 3 years ago

Krysztophe commented 3 years ago

At the beginning of a long-running transaction, add a DROP TABLE [1]. This will keep a lock on some system tables for the whole transaction duration and stop all probes from pgcluu_collectd. (Even \d+ is stuck).

So I obtain some artefacts on ALL curves:

[image: c]

[1] The customer knows that this is bad practice and will fix it, but I've seen this before in the wild.

The problem seems to be the query get_partitionned_tables, more precisely the call to pg_get_constraintdef(con.oid), which is blocked by the DROP TABLE.

I suggest that, after some timeout, the probe gives up so that we still collect at least the other data.
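
A minimal sketch of that idea, written in Python rather than pgcluu's Perl: each probe runs in turn, and one that times out is skipped so the remaining ones are still collected. The names `collect_all` and `run_probe`, and the convention that a blocked probe raises `TimeoutError`, are hypothetical and not taken from pgcluu_collectd.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collect_all(
    probes: Iterable[str],
    run_probe: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Run every probe; skip the ones that time out instead of aborting."""
    results: Dict[str, str] = {}
    skipped: List[str] = []
    for name in probes:
        try:
            results[name] = run_probe(name)
        except TimeoutError:
            # Assumed convention: run_probe raises TimeoutError when the
            # query could not get its lock in time. Record it and move on.
            skipped.append(name)
    return results, skipped
```

With this shape, a blocked get_partitionned_tables query would leave a gap in one report only, not in all curves.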

I'm wondering what would be the best way to avoid this in a general way:

darold commented 3 years ago

Commit 1d7d0b3 adds a new command line option -t | --lock-timeout, defaulting to 3 seconds, that sets the lock timeout for every SQL query executed by pgcluu_collectd. Since pgcluu_collectd uses psql to execute the queries, the patch simply sets the PGOPTIONS environment variable with the lock_timeout setting.
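
For illustration only, here is roughly what passing lock_timeout to psql through PGOPTIONS looks like. This is a Python sketch, not the Perl code from the commit; the helper names `env_with_lock_timeout` and `run_query` are hypothetical, and it assumes psql is on the PATH with connection parameters already set in the environment.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def env_with_lock_timeout(
    timeout_ms: int, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with lock_timeout added to PGOPTIONS."""
    env = dict(os.environ if base is None else base)
    option = f"-c lock_timeout={timeout_ms}"  # a bare number means milliseconds
    existing = env.get("PGOPTIONS", "").strip()
    # Keep any options the caller already set and append ours.
    env["PGOPTIONS"] = f"{existing} {option}".strip()
    return env


def run_query(sql: str, timeout_ms: int = 3000) -> subprocess.CompletedProcess:
    """Run one query through psql; a blocked query fails after timeout_ms."""
    return subprocess.run(
        ["psql", "-X", "-At", "-c", sql],
        env=env_with_lock_timeout(timeout_ms),
        capture_output=True,
        text=True,
    )
```

When the lock cannot be acquired in time, psql exits with a non-zero status and the error is on stderr, so the caller can skip that query and continue.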

Krysztophe commented 3 years ago

That seems to work, thanks!

Side effect: in "PostgreSQL non default settings", lock_timeout is now 3000 ms.

I don't think this is really a problem.

darold commented 3 years ago

Ah right, I hadn't thought about that. The solution is to set PGOPTIONS after the settings have been collected; I will fix that.
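
The ordering could be sketched like this (Python for illustration, not the actual patch; `collect_in_order` and `run_probe` are hypothetical names): the settings probe runs with the untouched environment, and PGOPTIONS is only set for the probes that follow, so pg_settings does not report the collector's own lock_timeout.

```python
from typing import Callable, Dict, List, Mapping


def collect_in_order(
    probes: List[str],
    run_probe: Callable[[str, Mapping[str, str]], str],
    base_env: Mapping[str, str],
    timeout_ms: int = 3000,
) -> Dict[str, str]:
    """Collect settings first, then apply lock_timeout to the other probes."""
    results: Dict[str, str] = {}

    # 1. Settings are read before PGOPTIONS is touched, so the report
    #    shows the server's own lock_timeout value.
    results["settings"] = run_probe("settings", base_env)

    # 2. Every other probe runs with lock_timeout injected via PGOPTIONS.
    limited_env = dict(base_env)
    limited_env["PGOPTIONS"] = f"-c lock_timeout={timeout_ms}"
    for name in probes:
        results[name] = run_probe(name, limited_env)
    return results
```

The trade-off is that the settings query itself runs without a lock timeout.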

darold commented 3 years ago

Commit 20823ae might fix the lock_timeout values grabbed from pg_settings.

Krysztophe commented 3 years ago

Thanks!