prometheus-community / postgres_exporter

A PostgreSQL metric exporter for Prometheus
Apache License 2.0

Unable to use as a Kubernetes sidecar container #194

Closed ball-hayden closed 6 years ago

ball-hayden commented 6 years ago

Inspired by the stable/postgresql Helm chart, I'm attempting to run postgres_exporter on Kubernetes as a sidecar container:

https://github.com/kubernetes/charts/blob/2091efdf343b26b8fdfd1cce75fce3ae7b626c94/stable/postgresql/templates/deployment.yaml#L111-L126

If I try to do this, I see the following output:

time="2018-06-01T14:46:45Z" level=info msg="Established new database connection." source="postgres_exporter.go:995"
time="2018-06-01T14:46:45Z" level=info msg="Error while closing non-pinging DB connection: <nil>" source="postgres_exporter.go:1001"
time="2018-06-01T14:46:45Z" level=info msg="Error opening connection to database (user=postgres%20host=127.0.0.1%20database=playerdata%20sslmode=disable): dial tcp 127.0.0.1:5432: connect: connection refused" source="postgres_exporter.go:1030"
time="2018-06-01T14:46:45Z" level=info msg="Starting Server: :9187" source="postgres_exporter.go:1137"

which makes sense, as postgres hasn't started yet. The exporter container continues to run and appears to be listening on port 9187, but any attempts to retrieve metrics time out:

root@test:/# curl http://172.30.148.191:9187/metrics
curl: (7) Failed to connect to 172.30.148.191 port 9187: Connection timed out

Does the exporter attempt to reconnect, or will this never work as a sidecar?

ball-hayden commented 6 years ago

Oh. :man_facepalming: My NetworkPolicy was blocking access to the exporter, both from my test container and from Prometheus. The exporter itself is working fine as a sidecar. Apologies.

inyee786 commented 5 years ago

@ball-hayden can you share the steps you used to get it running?

dominik-bln commented 5 years ago

We are seeing the same error with the official PostgreSQL Helm chart, with NetworkPolicy not enabled. Any further hints here would be appreciated.

ball-hayden commented 5 years ago

I expect this is an unrelated issue: my problem was definitely caused by my NetworkPolicy. For completeness though, this was the manifest I was using:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-master
  namespace: database
spec:
  serviceName: postgres-master
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: postgres-master
  template:
    metadata:
      labels:
        app: postgres-master
    spec:
      containers:
        - name: postgres
          image: <customised postgres container>
          imagePullPolicy: Always
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-master
                  key: password
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: pgdata
        - name: postgres-exporter
          image: wrouesnel/postgres_exporter:v0.4.6
          command: ["/postgres_exporter", "--extend.query-path", "/config/queries.yaml"]
          ports:
            - name: pg-metrics
              containerPort: 9187
          env:
            - name: DATA_SOURCE_NAME
              value: user=postgres host=127.0.0.1 database=playerdata sslmode=disable
          volumeMounts:
            - mountPath: /config
              name: postgres-exporter-config
      volumes:
        - name: postgres-exporter-config
          configMap:
            name: postgres-exporter-config
            defaultMode: 0555
  volumeClaimTemplates:
    - metadata:
        name: pgdata
        labels:
          billingType: "monthly"
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 20Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-exporter-config
  namespace: database
data:
  queries.yaml: |-
    pg_database:
      query: " SELECT pg_database.datname, pg_database_size(pg_database.datname) as size FROM pg_database"
      metrics:
        - datname:
            usage: "LABEL"
            description: "Name of the database"
        - size:
            usage: "GAUGE"
            description: "Disk space used by the database"
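
The manifest above does not include a NetworkPolicy. For anyone whose cluster restricts ingress to the `database` namespace, a rule along the following lines is roughly what is needed to let Prometheus reach the exporter port. This is only a sketch, not the policy actually used in this thread: the policy name, the `monitoring` namespace, and the `app: prometheus` pod label are assumptions and need to match your own Prometheus deployment. The `kubernetes.io/metadata.name` namespace label is set automatically on recent Kubernetes versions; on older clusters you would need to select the namespace by a label you apply yourself.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-postgres-exporter-scrape  # hypothetical name
  namespace: database
spec:
  # Applies to the pods created by the StatefulSet above
  podSelector:
    matchLabels:
      app: postgres-master
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Assumption: Prometheus runs in a namespace called "monitoring"
        # and its pods carry the label app=prometheus
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app: prometheus
      ports:
        # The exporter's metrics port (pg-metrics in the manifest above)
        - protocol: TCP
          port: 9187
```

Because the `namespaceSelector` and `podSelector` sit in the same `from` entry, traffic is allowed only from pods matching both. Note that once any policy selects the pod for ingress, clients of postgres on port 5432 also need a rule that allows them.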