Open rod-glover opened 1 year ago
This has been verified in two ways:
manage-views
(on a dead-end branch), which fixed the problem for those queries, which are very long-running.psql
connections from my work laptop, one with the recommended parameters, and one without. Each ran the query select pg_sleep(2.2*60*60);
(2.2 hr). The connection with parameters succeeded without error; the connection without got SSL SYSCALL error: Connection timed out
.
Note: This issue requires only a documentation change. Code does not need to be changed.
Something in the path to our database server, and/or the database engine itself, kills jobs that run longer than 120 minutes. This is the default keepalive time in both Linux and the database engine. One way to address this problem is to add keepalives on the client side at an interval shorter than the default (or whatever they happen to be set to be).
The PostgreSQL documentation for Database Connection Control Functions contains information on this. Note the following; this is a client-side setting, not a database engine setting:
The documentation lists the functions and the parameters that can be passed to them. These parameters are listed in section 34.1.2, and include the following keepalive parameters.
By setting these parameters to values that send keepalives at intervals shorter than the critical interval of 7200 s = 120 min, we can prevent the connection from being dropped.
Further, these parameters can be set in the database connection strings, which enables all clients to be set up appropriately without code change (which would be impossible with
psql
, for example). For example:The 300 s (5 min) interval is very conservative, but given the minimal overhead for keepalives, it should not cause problems.