This sounds like you have statement_timeout set either globally or for whichever database/user you are running pg_repack with. Can you try using psql to log into the same database, using the same username as you used with pg_repack, and check what:
SHOW statement_timeout;
says?
@schmiddy here's what you were asking for:
gitlabhq_production=# SHOW statement_timeout;
statement_timeout
-------------------
0
(1 row)
@schmiddy digging further, though: in the code for the preliminary_checks function you specifically establish a statement_timeout of 0: https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L508
Is there anywhere else I could look for this?
Err yeah this is a bit of a head-scratcher. The error message you posted:
ERROR: canceling statement due to statement timeout
can only be coming from Postgres here.
In theory we intend to be setting statement_timeout to 0 in preliminary_checks as you posted, but I wonder if maybe we confuse ourselves in lock_access_share, where we first set and then reset statement_timeout (and same deal in lock_exclusive).
Either way, I think with those dubious RESETs we'd be RESETting statement_timeout to whatever the default is for that connection/user/database/global setting. You showed earlier that you think this should be 0 (disabled) by default, but perhaps that error was really hit with a different user or database? Or the unlucky user had some other way of turning on statement_timeout, e.g. passing libpq connection parameters like:
PGOPTIONS="-c statement_timeout=5min" pg_repack ...
I had no problems repacking tables up to 750M rows as of pg_repack 1.4.0 on Amazon RDS.
Is the problem that the -k option (as required on RDS) not only bypasses the superuser checks, but also essential setup such as disabling the statement timeout?
A larger issue is the consequence of that: I'm only just starting to understand how pg_repack works, but in some scenarios (a large table with a high rate of updates), trying to select all the rows from one table and insert them into another in one transaction is bound to cause problems. Although it may avoid the kind of exclusive lock that vacuum full would take, the long-running transaction will also prevent vacuum from removing dead tuples, and the consequent accumulation of dead tuples may degrade performance until Bad Things happen.
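For anyone who wants to watch for this while a repack runs, a sketch (the table name some_busy_table is illustrative; backend_xmin has been in pg_stat_activity since 9.4):
-- long-running transactions and the oldest xmin they pin (what holds back vacuum):
SELECT pid, usename, xact_start, backend_xmin, state
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xact_start;
-- dead tuples accumulating on the busy table:
SELECT relname, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'some_busy_table';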
Setting statement_timeout before running any Postgres DDL operation is a good thing (from the documentation: https://www.postgresql.org/docs/9.4/runtime-config-client.html).
Here, before locking the table (for the drop/swap), pg_repack waits for existing operations/transactions on the table to finish. That's the reason it sets a timeout before the LOCK TABLE statement. If the ongoing transactions on the table don't finish in time, you receive the message: canceling statement due to statement timeout.
Next time it tries again with an increased statement timeout. It keeps repeating the same process until it is able to lock the table; once it has the lock, it performs the swap.
If we didn't set a statement_timeout (or set statement_timeout=0), the lock attempt could end up holding off other updates/operations on the table indefinitely.
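The statement sequence described above looks roughly like this (a sketch; the table name and the timeout values are illustrative, not pg_repack's actual constants):
SET statement_timeout TO '100ms';
LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE;
-- fails with "canceling statement due to statement timeout" if other
-- transactions still hold conflicting locks; on failure, back off and retry:
RESET statement_timeout;
SET statement_timeout TO '200ms';
LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE;
-- ...and so on, until the lock is acquired.
The point of the short timeout is that a waiting ACCESS EXCLUSIVE request blocks all new queries against the table, so each attempt is kept brief.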
I have a table that contains 25m rows, which I cannot trim. The repacking of this table times out, ostensibly because pg_repack cannot select all 25m rows into the repack table within the timeout. The problem is that I have set the timeout in the pg_repack command to a high value, yet no matter what, this transaction times out after 5 minutes of wall time. How can I successfully get this table repacked?
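Given the diagnosis earlier in this thread, one thing worth checking is whether a 5-minute statement_timeout is configured somewhere that a RESET would restore, i.e. a per-role or per-database override. A sketch of where to look:
-- per-role and per-database overrides (setconfig would list e.g. statement_timeout=5min;
-- a NULL datname or rolname means the override applies to all databases/roles):
SELECT d.datname, r.rolname, s.setconfig
FROM pg_db_role_setting s
LEFT JOIN pg_database d ON d.oid = s.setdatabase
LEFT JOIN pg_roles r ON r.oid = s.setrole;
-- and the source of the currently effective value in your session:
SELECT name, setting, source FROM pg_settings WHERE name = 'statement_timeout';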