This is related to https://github.com/postgresml/pgcat/pull/822 - we were seeing this message when trialling PgCat in our production environment. We couldn't see why the Postgres server in question was down, and the answer is that it wasn't 🙂
Instead, we were queueing for longer than connect_timeout. When that happens in PgBouncer, you get this:
linear_production_copy=# SELECT 1;
FATAL: 08P01: query_wait_timeout
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
In PgCat, you get this:
linear_production_copy=# SELECT 1;
FATAL: 58000: could not get connection from the pool - AllServersDown
Which is partly right and partly misleading. I think PgCat should use a more specific error message in this case. I'm happy to create a PR if people agree.
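A rough sketch of how the two cases could be told apart. This is not PgCat's actual code: the `CheckoutError` enum, its `QueueTimeout` variant, the `classify_checkout_failure` helper, and the message wording are all hypothetical and only illustrate the idea of reporting "waited too long in the queue" separately from "every server is down".

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical reasons a pool checkout can fail (illustrative names only).
#[derive(Debug, PartialEq)]
enum CheckoutError {
    /// No server in the pool is currently usable.
    AllServersDown,
    /// Servers are healthy, but no connection became free before the limit.
    QueueTimeout { waited: Duration, limit: Duration },
}

impl fmt::Display for CheckoutError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CheckoutError::AllServersDown => {
                write!(f, "could not get connection from the pool - AllServersDown")
            }
            CheckoutError::QueueTimeout { waited, limit } => write!(
                f,
                "could not get connection from the pool - waited {}ms in queue (connect_timeout {}ms)",
                waited.as_millis(),
                limit.as_millis()
            ),
        }
    }
}

/// Pick the error to report after a failed checkout.
/// Assumption: the caller knows how many servers are healthy and how long
/// the client waited before giving up.
fn classify_checkout_failure(
    healthy_servers: usize,
    waited: Duration,
    limit: Duration,
) -> CheckoutError {
    if healthy_servers == 0 {
        CheckoutError::AllServersDown
    } else {
        CheckoutError::QueueTimeout { waited, limit }
    }
}
```

With something along these lines, a client that queued past the limit while the servers were up would see a timeout message rather than AllServersDown.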
I certainly agree with that.
There are several error messages around checkout and health checks that could be made clearer, but we can start with this one.