airflow-helm / charts

The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.
https://github.com/airflow-helm/charts/tree/main/charts/airflow
Apache License 2.0

Enable querying of PgBouncer metrics by supporting additional PgBouncer configuration #819

Closed jbvaningen closed 2 months ago

jbvaningen commented 5 months ago

Motivation

At my company, we would like to export PgBouncer metrics to Prometheus using prometheus-community/pgbouncer_exporter. We already monitor the other Airflow components with Prometheus through the StatsD exporter. Exporting PgBouncer metrics to Prometheus is also described as an option in https://github.com/airflow-helm/charts/issues/570.

Adding the PgBouncer metrics exporter was straightforward and required only configuration (details below), except for one blocking issue: the pgbouncer.ini file, which is managed entirely by the User-Community Airflow Helm Chart, needs one additional configuration line.

What additional PgBouncer configuration is needed and why?

To query PgBouncer metrics, you need read-only access to the PgBouncer console. According to https://www.pgbouncer.org/config.html#console-access-control, all console access is disabled by default.

Our problem would be solved if the User-Community Airflow Helm Chart allowed us to configure the value of stats_users. It could also be solved by making admin_users configurable, but we prefer the former, in keeping with the principle of least privilege.
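
As a rough illustration, the generated pgbouncer.ini would only need one extra line in its [pgbouncer] section. The sketch below wraps that file in a Kubernetes Secret purely to keep the example in YAML; the Secret name and the user name `airflow_user` are placeholders, and this is not a statement about how the chart actually stores the file. All other settings are omitted.

```yaml
# Illustration only: the Secret name and user name are placeholders,
# and every other pgbouncer.ini setting is left out.
apiVersion: v1
kind: Secret
metadata:
  name: example-pgbouncer-config   # hypothetical name
stringData:
  pgbouncer.ini: |
    [pgbouncer]
    ; users allowed to run read-only console queries (for example SHOW STATS)
    stats_users = airflow_user
```

The exporter would then connect to the PgBouncer console as that user, without needing the broader rights that admin_users would grant.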

How we add the PgBouncer metrics exporter

It requires three changes:

  1. Add sub-chart to Chart.yaml
    description: Our Company Airflow Helm Chart
    name: our-company-airflow
    version: 1.0.0
    appVersion: 2.7.3
    icon: https://avatars.githubusercontent.com/u/71061241
    home: https://gitlab.com/our-company/airflow/-/tree/main/charts/our-company-airflow
    maintainers:
      - name: our-team
    keywords:
      - airflow
      - dag
      - workflow
    dependencies:
      - name: airflow
        version: 8.8.0
        repository: https://airflow-helm.github.io/charts
      - name: prometheus-statsd-exporter
        version: 0.11.0
        repository: https://prometheus-community.github.io/helm-charts
        condition: prometheus.enabled
      # Add the below lines
      - name: prometheus-pgbouncer-exporter
        version: 0.1.1
        repository: https://prometheus-community.github.io/helm-charts
        condition: prometheus.enabled
  2. Add appropriate configurations to values-<environment>.yaml:

    ---
    airflow:
      # ... (lots of other stuff here obviously)

      pgbouncer:
        livenessProbe:
          timeoutSeconds: 15

        # We would like to ADD this config option, it does NOT exist yet
        statsUsers: "<airflow-user-name>"

      # ... (more configs)

    # example configuration of prometheus-pgbouncer-exporter sub-chart
    prometheus-pgbouncer-exporter:
      postgresql:
        enabled: false # disable postgresql sub-chart
      image:
        tag: v0.7.0
      config:
        datasource:
          host: ""
          user: ""
          passwordSecret:
            name: ""
            key: ""
          port:

3. Deploy appropriate `PodMonitoring` resources so that the `/metrics` endpoint of the `prometheus-pgbouncer-exporter` pod gets scraped by Prometheus.
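
For the third step, a minimal `PodMonitoring` resource for Google Cloud Managed Service for Prometheus might look like the sketch below. The resource name, namespace, pod label, and port are assumptions for illustration and are not taken from the sub-chart; check the labels and ports on the actual exporter pod before using it.

```yaml
# Sketch only: name, namespace, label and port are assumptions.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: pgbouncer-exporter        # hypothetical name
  namespace: airflow              # assumed namespace of the release
spec:
  selector:
    matchLabels:
      # assumed label; verify against the exporter pod's labels
      app.kubernetes.io/name: prometheus-pgbouncer-exporter
  endpoints:
    - port: 9127                  # assumed: pgbouncer_exporter's default listen port
      path: /metrics
      interval: 30s
```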

### Installation details
- Python `3.10`
- Airflow `2.7.3`
- User-Community Airflow Helm Chart `8.8.0`
- Deployed to a GKE cluster version `1.25.15-gke.1115000`

### Similar issues
- https://github.com/airflow-helm/charts/issues/500 is very similar, but the discussion is stale.
- https://github.com/airflow-helm/charts/issues/570 would probably also solve this issue, but its scope is much larger and I don't see any recent activity.

### Implementation

Add one more configuration option to the `pgbouncer` section of the User-Community Airflow Helm Chart and use it when generating the `pgbouncer.ini` file (this happens in https://github.com/airflow-helm/charts/blob/main/charts/airflow/templates/pgbouncer/_helpers/pgbouncer.tpl).

I can create a PR with this change.

### Are you willing & able to help?

- [X] I am able to submit a PR!
- [X] I can help test the feature!