Add monitoring configuration to CNPG

paradedb / charts

ParadeDB Helm Charts, based on the official CloudNativePG Helm Charts

https://paradedb.github.io/charts/

GNU Affero General Public License v3.0

Add monitoring configuration to CNPG #4

Closed (philippemnoel closed 1 week ago)

philippemnoel commented 1 month ago

What: I've disabled PodMonitor, as it requires enabling the Grafana and Prometheus charts:

https://cloudnative-pg.io/documentation/1.23/quickstart/#part-4-monitor-clusters-with-prometheus-and-grafana

In order to properly assist customers in a BYOC environment, we should re-enable monitoring so that we can export telemetry (at least Prometheus metrics) and plot it to see what's wrong.

Why: See above.

How: You can see the tutorial for adding Prometheus/Grafana here: https://cloudnative-pg.io/documentation/current/quickstart/
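For reference, a minimal sketch of what re-enabling the PodMonitor looks like on a CloudNativePG `Cluster` resource. The cluster name, instance count, and storage size are placeholders rather than values taken from this chart, and the manifest assumes the Prometheus Operator CRDs (which provide the `PodMonitor` kind) are already installed in the target cluster:

```yaml
# Hypothetical example cluster; name, instances, and storage size are placeholders.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
  monitoring:
    # Asks the operator to create a PodMonitor for this cluster.
    # Requires the PodMonitor CRD from the Prometheus Operator to exist.
    enablePodMonitor: true
```

Without the Prometheus Operator CRDs, the `PodMonitor` cannot be created, which is presumably why the setting was turned off here in the first place.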

philippemnoel commented 1 month ago

Note that I think we probably want Prometheus, but I'm not 100% sure we want Grafana. Perhaps just piping Prometheus metrics directly to us is easier, and we can have a single unified dashboard? Maybe starting with just Prometheus is lower overhead and a good first step.

vaibhawvipul commented 1 month ago

Taking this up.

philippemnoel commented 1 month ago

This is already configured in the paradedb/byoc repo as well, so we can probably just take it from there.

philippemnoel commented 1 month ago

From Mauricio:

[20:49, 14/8/2024] Mauricio Araujo: Hey Phil, they are not included; those need to be installed as separate charts. The cnpg chart only installs the operator and the necessary definitions for that. The operator has a metrics exporter which essentially exposes the metrics to Prometheus, but Prometheus itself has to be installed and configured to scrape the cnpg metrics; it won't do that by default. Also, for Grafana you need to configure the cnpg dashboard.

[20:49, 14/8/2024] Mauricio Araujo: That is all explained here: https://cloudnative-pg.io/documentation/1.23/quickstart/#part-4-monitor-clusters-with-prometheus-and-grafana

philippemnoel commented 1 week ago

https://github.com/cloudnative-pg/grafana-dashboards/blob/main/charts/cluster/grafana-dashboard.json

^ We might need to bring this back. Why was it moved to a dedicated repository, @itay-grudev?

itay-grudev commented 1 week ago

It makes maintenance of the dashboard easier. We were also planning to expand the dashboards.

That being said, the Grafana dashboard is part of the operator chart. I strongly discourage you from maintaining a copy of it. Users should be encouraged to use the official operator chart.

itay-grudev commented 1 week ago

As for monitoring and installing Prometheus and/or Grafana: that is up to the user, and it is disabled by default in both charts. I recommend using the kube-prometheus-stack Helm chart, which can be configured separately.
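As a rough illustration of the "Prometheus first, Grafana later" idea raised earlier in the thread, a values file for kube-prometheus-stack might look like the sketch below. The selector settings mirror the sample configuration referenced in the CloudNativePG quickstart; disabling Grafana is an assumption about what a Prometheus-only setup would want, not something this repository ships:

```yaml
# Sketch of a values file for the prometheus-community/kube-prometheus-stack chart.
# Assumption: Prometheus only, with Grafana turned off for now.
grafana:
  enabled: false

prometheus:
  prometheusSpec:
    # Let Prometheus pick up PodMonitors, ServiceMonitors, rules, and probes
    # that were not created by this Helm release (such as the one CNPG creates).
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false
```

With the selectors left at their defaults, Prometheus only discovers monitors labeled for its own release, so the PodMonitor created for a CNPG cluster would likely go unscraped.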