CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0
3.97k stars 595 forks source link

not able to connect through pgbouncer after increasing max_connections=200 #2002

Closed kseswar closed 4 years ago

kseswar commented 4 years ago

Hi there,

We have increased the max_connections=200 and deployed a postgres image in same cluster. Entered into the postgres image and trying to connect to the pgbouncer. But we are getting an error psql: error: could not connect to server: ERROR: pgbouncer cannot connect to server

Also tried providing Cluster IP instead of service name in host but i can see same error

Also tried to connect through my primary and replica ( hippo and hippo-replica respectively), we are able to connect. Note: my pgbouncer service name is hippo-pgbouncer.

This observed after changing the max_connections. is this because of changing it?

Please find the screenshots for the same. image

image

Please tell us about your environment:

Operating System: Linux based Where is this running ( Local, Cloud Provider) - GCP GKE cluster Storage being used (NFS, Hostpath, Gluster, etc):GCP PVC, Container Image Tag: centos7-4.3.0 / centos7-12.2-4.3.0 PostgreSQL Version:12.2 Platform (Docker, Kubernetes, OpenShift): Kubernetes Platform Version: 1.16.11-gke.5

jkatz commented 4 years ago
kseswar commented 4 years ago

Hi,

Please find the screenshots of pgbouncer logs and postgres db logs (this is the logs of the postgres image that i have deployed in the cluster and trying to connect through pgbouncer)

image

image

Kindly let me know if anything is required.

Regards, Keerthana

jkatz commented 4 years ago

Thanks. I still need to see the PostgreSQL server logs. Those reside in the /pgdata/<clustername/pg_log directory, and in particular, I would need to see the logs at the time you were attempting to connect to PostgreSQL via pgbouncer.

kseswar commented 4 years ago

Hi,

I would find the below logs under /pgdata/hippo/pglogs

2020-11-01 15:34:20.089 UTC [4016736] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:20.327 UTC [1241119] LOG: duration: 0.716 ms statement: SELECT pg_catalog.to_char(pg_catalog.pg_postmaster_start_time(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn()), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint, pg_catalog.to_char(pg_catalog.pg_last_xact_replay_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), pg_catalog.pg_is_in_recovery() AND pg_catalog.pg_is_wal_replay_paused(), pg_catalog.array_to_json(pg_catalog.array_agg(pg_catalog.row_to_json(ri))) FROM (SELECT (SELECT rolname FROM pg_authid WHERE oid = usesysid) AS usename, application_name, client_addr, w.state, sync_state, sync_priority FROM pg_catalog.pg_stat_get_wal_senders() w, pg_catalog.pg_stat_get_activity(pid)) AS ri 2020-11-01 15:34:21.456 UTC [1241119] LOG: duration: 0.332 ms statement: SELECT CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN GREATEST( pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), '0/0'), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint)ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, 0 2020-11-01 15:34:22.124 UTC [4016781] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:24.165 UTC [4016782] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:26.210 UTC [4016783] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:28.248 UTC [4016785] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:29.548 UTC [4016786] FATAL: no pg_hba.conf entry for host "10.124.21.220", user "pgbouncer", database "db", SSL off 2020-11-01 15:34:29.694 UTC [1241119] LOG: duration: 0.646 ms statement: SELECT pg_catalog.to_char(pg_catalog.pg_postmaster_start_time(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn()), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint, pg_catalog.to_char(pg_catalog.pg_last_xact_replay_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), pg_catalog.pg_is_in_recovery() AND pg_catalog.pg_is_wal_replay_paused(), pg_catalog.array_to_json(pg_catalog.array_agg(pg_catalog.row_to_json(ri))) FROM (SELECT (SELECT rolname FROM pg_authid WHERE oid = usesysid) AS usename, application_name, client_addr, w.state, sync_state, sync_priority FROM pg_catalog.pg_stat_get_wal_senders() w, pg_catalog.pg_stat_get_activity(pid)) AS ri 2020-11-01 15:34:30.283 UTC [4016802] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:31.456 UTC [1241119] LOG: duration: 0.251 ms statement: SELECT CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN GREATEST( pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), '0/0'), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint)ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, 0 2020-11-01 15:34:32.323 UTC [4016804] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:34.364 UTC [4016805] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:35.366 UTC [1241119] LOG: duration: 0.700 ms statement: SELECT pg_catalog.to_char(pg_catalog.pg_postmaster_start_time(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn()), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint, pg_catalog.to_char(pg_catalog.pg_last_xact_replay_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), pg_catalog.pg_is_in_recovery() AND pg_catalog.pg_is_wal_replay_paused(), pg_catalog.array_to_json(pg_catalog.array_agg(pg_catalog.row_to_json(ri))) FROM (SELECT (SELECT rolname FROM pg_authid WHERE oid = usesysid) AS usename, application_name, client_addr, w.state, sync_state, sync_priority FROM pg_catalog.pg_stat_get_wal_senders() w, pg_catalog.pg_stat_get_activity(pid)) AS ri 2020-11-01 15:34:36.402 UTC [4016851] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:38.435 UTC [4016852] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:39.704 UTC [1241119] LOG: duration: 0.662 ms statement: SELECT pg_catalog.to_char(pg_catalog.pg_postmaster_start_time(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn()), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint, pg_catalog.to_char(pg_catalog.pg_last_xact_replay_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS TZ'), pg_catalog.pg_is_in_recovery() AND pg_catalog.pg_is_wal_replay_paused(), pg_catalog.array_to_json(pg_catalog.array_agg(pg_catalog.row_to_json(ri))) FROM (SELECT (SELECT rolname FROM pg_authid WHERE oid = usesysid) AS usename, application_name, client_addr, w.state, sync_state, sync_priority FROM pg_catalog.pg_stat_get_wal_senders() w, pg_catalog.pg_stat_get_activity(pid)) AS ri 2020-11-01 15:34:40.482 UTC [4016868] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:41.456 UTC [1241119] LOG: duration: 0.259 ms statement: SELECT CASE WHEN pg_catalog.pg_is_in_recovery() THEN 0 ELSE ('x' || pg_catalog.substr(pg_catalog.pg_walfile_name(pg_catalog.pg_current_wal_lsn()), 1, 8))::bit(32)::int END, CASE WHEN pg_catalog.pg_is_in_recovery() THEN GREATEST( pg_catalog.pg_wal_lsn_diff(COALESCE(pg_catalog.pg_last_wal_receive_lsn(), '0/0'), '0/0')::bigint, pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_last_wal_replay_lsn(), '0/0')::bigint)ELSE pg_catalog.pg_wal_lsn_diff(pg_catalog.pg_current_wal_lsn(), '0/0')::bigint END, 0 2020-11-01 15:34:42.525 UTC [4016869] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:44.553 UTC [4016870] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off 2020-11-01 15:34:44.592 UTC [4016871] FATAL: no pg_hba.conf entry for host "10.124.21.220", user "pgbouncer", database "db", SSL off 2020-11-01 15:34:46.591 UTC [4016872] FATAL: no pg_hba.conf entry for host "127.0.0.1", user "ccp_monitoring", database "postgres", SSL off

kseswar commented 4 years ago

Hi Team, Any update on this?

kseswar commented 4 years ago

Hi Team, Please let me know if there is any update

jkatz commented 4 years ago

The logs indicate that the PostgreSQL cluster lacks the ability to authenticate pgbouncer or ccp_monitoring, which can be a result of a variety of issues:

or something else.

Regardless, the error has nothing to do with max_connections.