Icinga / helm-charts

Kubernetes Helm charts to deploy a ready-to-use Icinga monitoring stack.
https://icinga.com
Apache License 2.0

[Feature]: Add readiness/liveness probes for components of the icinga-stack chart #3

Open mocdaniel opened 1 year ago

mocdaniel commented 1 year ago

Affected Chart

icinga-stack

Please describe your feature request

We should introduce proper health checks for components of the icinga-stack chart.

For Icinga2, we could hit the API at https://<host>:<api-port>/v1. Once we receive a 401 Unauthorized, Icinga2 and its API should be up and running (unfortunately, there is no dedicated health endpoint).
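One caveat: Kubernetes httpGet probes only count status codes from 200 to 399 as success, so an unauthenticated request that returns 401 would be reported as a failure. A minimal sketch of an exec probe that treats 401 as healthy instead, assuming that sh and curl are available in the Icinga2 container image (not verified here) and that the API listens on Icinga2's default port 5665:

```yaml
# Sketch only: assumes `sh` and `curl` exist in the container image.
readinessProbe:
  failureThreshold: 3
  exec:
    command:
      - sh
      - -c
      # -k skips certificate verification, -s silences progress output,
      # -w prints only the HTTP status code; the probe passes on 401.
      - >-
        test "$(curl -ks -o /dev/null -w '%{http_code}' https://localhost:5665/v1)" = "401"
```

If curl is not present in the image, an authenticated httpGet probe or a plain tcpSocket probe on the API port would be the alternatives.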

For Icingaweb2, we could just hit the webpage.

Director and IcingaDB remain to be evaluated.

Donien commented 1 year ago

icinga2 chart

Regarding the icinga2 chart we can use an httpGet probe, passing a header for basic auth.
We'd need to fill it with credentials (the static user name icingaweb plus the password from .Values.global.api.users.icingaweb.password) and pass them base64-encoded.

It could look something like this:
(charts/icinga-stack/charts/icinga2/templates/statefulset.yaml)

readinessProbe:
  failureThreshold: 3
  httpGet:
    scheme: HTTPS
    path: /v1
    port: {{ .Values.service.port }}
    httpHeaders:
      - name: Authorization
        value: Basic {{ printf "%s:%s" "icingaweb" .Values.global.api.users.icingaweb.password | b64enc }}

For the liveness probe we could just pick a higher failureThreshold. I think it's fair to say that we can just assume the chart to work if the icinga2 API answers our request.
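A matching liveness probe could be sketched as follows. It repeats the request from the readiness probe above; the failureThreshold of 6 is only an assumed example value, chosen to be higher than the readiness threshold of 3:

```yaml
# Sketch only: same request as the readiness probe, but more tolerant
# before the container gets restarted.
livenessProbe:
  failureThreshold: 6  # assumed value, higher than the readiness threshold
  httpGet:
    scheme: HTTPS
    path: /v1
    port: {{ .Values.service.port }}
    httpHeaders:
      - name: Authorization
        value: Basic {{ printf "%s:%s" "icingaweb" .Values.global.api.users.icingaweb.password | b64enc }}
```

Keeping the liveness threshold above the readiness threshold means a slow API first takes the pod out of service and only triggers a restart if it stays unresponsive.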

icingaweb2 chart

I've already noticed a problem regarding this chart. Simply checking the base URL of that container does not work (using an httpGet probe).
The problem is twofold:

Within a probe we also can't do an equivalent of curl -L "USER:PASSWORD@DOMAIN:PORT/".

This means that we can't authenticate against its main page.

We could however try to check for static images such as /img/favicon.png. This would only tell us that an image of that name can be found though.
Any other ideas are appreciated here :)

Possible example:
(charts/icinga-stack/charts/icingaweb2/templates/deployment.yaml)

readinessProbe:
  failureThreshold: 3
  httpGet:
    scheme: HTTP
    path: /img/favicon.png
    port: {{ .Values.service.port }}

Also, would probes in charts/icinga-stack/templates/internal-databases.yaml be useful?

mocdaniel commented 1 year ago

Regarding icinga2: I like the proposition, feel free to implement it this way.

Regarding icingaweb2: That's probably the best way to do it right now. As you mentioned, Icingaweb2 doesn't support BasicAuth and the httpGet probes are somewhat limited in their configurability.

Regarding internal-databases: Those use mariadb:latest by default, which comes with its own healthcheck script preinstalled: https://github.com/MariaDB/mariadb-docker/blob/master/healthcheck.sh

We'd need to invoke this properly for our use case. Feel free to take a look at the linked script and draft an exec probe :)
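As a starting point, an exec probe could look roughly like this. The --connect and --innodb_initialized options come from the linked healthcheck.sh; the timing values are assumptions, and whether the script finds usable credentials in our setup (depending on the image version, it may rely on a health check user created by the image's entrypoint) still needs to be verified:

```yaml
# Sketch only: timing values are assumed, not tested against this chart.
readinessProbe:
  initialDelaySeconds: 10  # assumed; give MariaDB time to initialize
  periodSeconds: 10        # assumed
  failureThreshold: 3
  exec:
    command:
      - healthcheck.sh
      - --connect             # server accepts connections
      - --innodb_initialized  # InnoDB has finished initializing
```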