Vonng / pg_exporter

Advanced PostgreSQL & Pgbouncer Metrics Exporter for Prometheus
https://pigsty.io
Apache License 2.0

fail connecting to primary server: fail fetching server version #23

Closed perrygeo closed 2 years ago

perrygeo commented 2 years ago

Thanks for the great work on this! I'm running pg_exporter and I'm hitting an error on the precheck steps.

$ pg_exporter
INFO[0000] retrieve target url  from PG_EXPORTER_URL     source="pg_exporter.go:1938"
INFO[0000] retrieve config path pg_exporter.yaml from PG_EXPORTER_CONFIG  source="pg_exporter.go:2009"
ERRO[0000] fail connecting to primary server: fail fetching server version: driver: bad connection, retrying in 10s  source="pg_exporter.go:1517"

This appears to be where the query in question is made: SHOW server_version_num;

When I connect directly with psql using the same URI, I'm able to run it without issue.

$ psql $PG_EXPORTER_URL
psql (12.8 (Ubuntu 12.8-0ubuntu0.20.04.1), server 14.0 (Debian 14.0-1.pgdg110+1))
WARNING: psql major version 12, server major version 14.
         Some psql features might not work.
Type "help" for help.

postgres=# show server_version_num;
 server_version_num 
--------------------
 140000
(1 row)

I'm not sure whether it matters, but I'm accessing this database over a TCP proxy using kubectl proxy. It doesn't seem to affect any other Postgres clients, but it's worth mentioning.

What am I missing?

Vonng commented 2 years ago

This error message is emitted by the driver when the network connection is dead. From the Go database/sql/driver documentation:

// ErrBadConn should be returned by a driver to signal to the sql
// package that a driver.Conn is in a bad state (such as the server
// having earlier closed the connection) and the sql package should
// retry on a new connection.

// ErrBadConn should only be returned from Validator, SessionResetter, or
// a query method if the connection is already in an invalid (e.g. closed) state.

Try launching the exporter inside Kubernetes. If it works there, then something may be wrong with the proxy.

A TCP traffic dump would be very helpful for diagnosing this.

Vonng commented 2 years ago

There's another possibility: a timeout.

The version fetch at startup has a hard-coded 100ms timeout.

If the round-trip time between Postgres and the exporter is more than 100ms, the query may time out.

This parameter will become a config entry in the next release.

Vonng commented 2 years ago

Resolved by https://github.com/Vonng/pg_exporter/commit/9b8253b844764cad4f6530ee4933d4209e0eeec1