grafana / helm-charts

Apache License 2.0

Grafana alerting HA using the Grafana helm chart #2408

Open Nog-Frog opened 1 year ago

Nog-Frog commented 1 year ago

Hey, I'm using the Grafana helm chart from this repo in an HA setup with 3 replicas, with unified_alerting enabled. My team uses a Slack integration to receive alerts. Whenever an alert enters the firing state, we receive 3 notifications for it.

I found this guide from Grafana: https://grafana.com/docs/grafana/latest/alerting/set-up/configure-high-availability/#enable-alerting-high-availability-using-kubernetes

But this does not seem possible with the current service.yaml template: it appears I can only specify one port for the service, whereas the guide requires an additional port for alerting.

Am I missing something that will make this possible anyway? Thanks in advance.

Kot-o-pes commented 1 year ago

+1. You also need to deploy an additional kind: Service to get it running. So far we have had to work around this with kubectl.
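For reference, a manually applied headless Service of the kind described above might look roughly like the sketch below. The Service name, namespace, and selector labels are assumptions for illustration and would need to match the labels on your Grafana pods. Port 9094 is the alerting gossip port used in the ha_* settings discussed in this thread.

```yaml
# Sketch only: a headless Service exposing the alerting gossip port.
# The name, namespace, and selector below are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: grafana-alerting        # hypothetical name
  namespace: monitoring         # hypothetical namespace
spec:
  clusterIP: None               # headless: DNS returns the individual pod IPs
  # Optional Kubernetes Service field: also publish IPs of pods that have
  # not passed their readiness probe yet.
  # publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/name: grafana   # assumption: must match your pod labels
  ports:
    - name: gossip-tcp
      port: 9094
      targetPort: 9094
      protocol: TCP
    - name: gossip-udp
      port: 9094
      targetPort: 9094
      protocol: UDP
```

With a Service like this, ha_peers would point at the Service's DNS name on port 9094.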

IgorArg commented 1 year ago

I have the same issue. From the guide, it looks like the chart needs to be customized in order to create a new service for HA. I also found a short instruction at https://github.com/grafana/helm-charts/tree/main/charts/grafana#high-availability-for-unified-alerting that does not involve creating a service, but I haven't tried it yet.

iXingo commented 1 year ago

As a next step you have to set up the grafana.ini in your values.yaml so that it makes use of the headless service to obtain all the IPs of the cluster. You should replace {{ Name }} with the name of your helm deployment.

grafana.ini:
  ...
  unified_alerting:
    enabled: true
    ha_peers: {{ Name }}-headless:9094
    ha_listen_address: ${POD_IP}:9094
    ha_advertise_address: ${POD_IP}:9094

  alerting:
    enabled: false

Does that mean I can use any name I want for {{ Name }}?

IgorArg commented 1 year ago

Update on my earlier comment: the short instruction at https://github.com/grafana/helm-charts/tree/main/charts/grafana#high-availability-for-unified-alerting looks like it works. I used the name of my deployment in ha_peers: {{ Name }}-headless:9094 and didn't try anything else.

cmitri-harris commented 1 year ago

@Kot-o-pes you don't need to deploy an additional service; helm takes care of it when you enable "headlessService" in values.
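Putting the pieces from this thread together, a values.yaml fragment might look like the sketch below. It assumes the chart names the headless service grafana-headless (that is, {{ Name }} resolves to grafana); adjust the ha_peers host to whatever your release actually produces.

```yaml
# values.yaml fragment (sketch; assumes the headless service is named
# "grafana-headless" -- check the actual name with `kubectl get svc`)
replicas: 3
headlessService: true

grafana.ini:
  unified_alerting:
    enabled: true
    ha_peers: grafana-headless:9094          # assumption: {{ Name }} = grafana
    ha_listen_address: ${POD_IP}:9094
    ha_advertise_address: ${POD_IP}:9094
  alerting:
    enabled: false
```

After upgrading the release, `kubectl get endpoints grafana-headless` should list one address per replica once the pods are ready.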

japtain-cack commented 9 months ago

Environment

Talos Linux deployed with Sidero Metal
Talos Version: 1.4.7
Kubernetes Version: 1.27.4
Grafana Installation: Helm

Issue

I was having trouble getting this working as well. I configured grafana.ini just as @iXingo described, replacing {{ Name }}-headless:9094 with grafana-headless:9094. However, DNS resolution was broken for me, which I resolved by adding the GODEBUG=netdns=go environment variable as mentioned here. That fixed DNS resolution, but the headless service endpoint IPs were then stuck in NotReadyAddresses: until the liveness/readiness probes pass, the endpoint IPs are never marked as ready. So I had to remove the probes. Once I did that, the IPs were enabled in the headless service endpoint and things started working.
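As a rough illustration of the DNS workaround described above, the environment variable can be passed through the chart's env map in values.yaml. This is a sketch of one way to set it, not necessarily how it was set in this environment.

```yaml
# values.yaml fragment (sketch): force Go's pure-Go DNS resolver in the
# Grafana container, as described in the comment above.
env:
  GODEBUG: netdns=go
```

Whether this is needed at all depends on the cluster's DNS setup; it was only reported as necessary for the Talos environment listed above.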

Is the headless service/endpoint documented anywhere? I didn't see anything mentioning modifying/removing any of the probes, so I assume it should work with them enabled. However, for me, this was not the case.