Open odscjames opened 9 months ago
As an update, the frequency of the SSL SYSCALL error: EOF detected errors, which are caused by long-running queries, is variable but (at the moment) not that rare:
In October 2024, the psycopg2.OperationalError: SSL SYSCALL error: EOF detected exception was thrown on average once per day.
In November (so far), it has been thrown once every other day.
Brief Description We think we have seen connections go away, and we think the cause is a really long operation elsewhere that means no SQL is run for ages, thus causing the server to hang up the connection.
Severity Low (as it rarely happens)
Solutions Should we set keepalives and keepalives_idle? keepalives may be on by default, but with a very high default idle time. Maybe we should turn keepalives_idle down?
And maybe set keepalives_interval and keepalives_count too, just to make sure we have good values on them rather than relying on defaults.
https://www.postgresql.org/docs/16/libpq-connect.html
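A minimal sketch of what explicit keepalive settings could look like with psycopg2, which passes these libpq parameters through as keyword arguments to `psycopg2.connect`. The specific values (30 / 10 / 5) and the helper names `KEEPALIVE_DEFAULTS`, `with_keepalives` and `connect` are assumptions for illustration only. They are not taken from the refresher code base and would need tuning for the real deployment.

```python
from typing import Any, Dict

# Illustrative values only (assumed, not measured against production).
KEEPALIVE_DEFAULTS: Dict[str, int] = {
    "keepalives": 1,            # enable TCP keepalives (libpq's default is already 1)
    "keepalives_idle": 30,      # seconds of inactivity before the first probe is sent
    "keepalives_interval": 10,  # seconds between probes that go unanswered
    "keepalives_count": 5,      # unanswered probes before the connection is treated as dead
}


def with_keepalives(conn_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of conn_kwargs with keepalive settings filled in.

    Anything the caller already set explicitly takes precedence.
    """
    merged: Dict[str, Any] = dict(KEEPALIVE_DEFAULTS)
    merged.update(conn_kwargs)
    return merged


def connect(**conn_kwargs: Any):
    """Hypothetical wrapper: open a psycopg2 connection with keepalives applied."""
    import psycopg2  # imported here so the helper above has no hard dependency

    return psycopg2.connect(**with_keepalives(conn_kwargs))
```

Usage would be along the lines of `connect(host=..., dbname=..., user=..., password=...)`. Note that keepalives operate at the TCP level, so they should help if an intermediate network device is dropping idle connections, but they would probably not help if the disconnect comes from a server-side session timeout.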
Related This causes other problems in the code - https://github.com/IATI/refresher/issues/316