grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0

Grafana External Alert Manager: Does not respect tlsSkipVerify flag #73685

Open Amit-Maersk opened 1 year ago

Amit-Maersk commented 1 year ago

What happened?

We use an in-house service that acts as an Alertmanager. We added this service as an Alertmanager datasource, and all Grafana-managed, Mimir, and Loki alerts are sent to it. This worked fine until we enabled SSL on the Alertmanager service. After enabling SSL, we updated the external Alertmanager config, the Mimir ruler Alertmanager config, and the Loki ruler Alertmanager config to use the client certificate and key when communicating with the service. The configuration looks like this:

Mimir & Loki Ruler Configuration --- Works

 - -ruler.alertmanager-url=https://<service-name>.<namespace>:<port>
 - -alertmanager_client.tls_cert_path=/path-to-file/client.crt
 - -alertmanager_client.tls_insecure_skip_verify=true
 - -alertmanager_client.tls_key_path=/path-to-file/client.key

Grafana External Alertmanager configuration --- Does not Work

- access: "proxy"
  editable: false
  name: "AlertManagerService"
  uid: "alertmamanagerservice"
  orgId: 1
  type: "alertmanager"
  url: "https://<service-name>.<namespace>:<port>"
  version: 1
  jsonData:
    handleGrafanaManagedAlerts: true
    implementation: prometheus
    tlsAuth: true
    tlsSkipVerify: true
  secureJsonData:
    tlsClientCert: "/path-to-file/client.crt"
    tlsClientKey: "/path-to-file/client.key"


ERROR

The error I see in the Grafana logs:

logger=ngalert.sender.external-alertmanager t=2023-08-23T13:03:07.184666046Z level=error alertmanager=https://<service-name>.<namespace>:<port>/api/v2/alerts count=1 msg="Error sending alert" err="Post \"https://<service-name>.<namespace>:<port>/api/v2/alerts\": tls: failed to verify certificate: x509: certificate signed by unknown authority"

What did you expect to happen?

Grafana should honor the tlsSkipVerify flag and skip validation of the server certificate when sending alerts to the external Alertmanager. If tlsSkipVerify is in fact working as intended, I am not sure why the same configuration works for the Mimir and Loki rulers but not for the Grafana external Alertmanager.
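To make the expected behavior concrete, here is a minimal Go sketch (not Grafana code) that posts to a local HTTPS test server whose certificate is not in the system trust store. The helper names `newClient`, `postAlerts`, and `newTestServer` are made up for illustration; only the standard library is used. With verification on, the request is expected to fail during the TLS handshake with an x509 error similar to the one in the log above; with `InsecureSkipVerify` set, it should go through.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newClient returns an HTTP client whose server-certificate verification
// is disabled when skipVerify is true.
func newClient(skipVerify bool) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipVerify},
		},
	}
}

// postAlerts sends an empty alert list to baseURL + /api/v2/alerts and
// returns any transport-level error (for example a TLS verification failure).
func postAlerts(c *http.Client, baseURL string) error {
	resp, err := c.Post(baseURL+"/api/v2/alerts", "application/json", strings.NewReader("[]"))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

// newTestServer starts a local HTTPS server that uses the test certificate
// bundled with net/http/httptest, which normal clients do not trust.
func newTestServer() *httptest.Server {
	return httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	fmt.Println("verification on: ", postAlerts(newClient(false), srv.URL))
	fmt.Println("verification off:", postAlerts(newClient(true), srv.URL))
}
```

The exact wording of the verification error depends on the platform's certificate verifier, so the sketch prints it rather than matching on it.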

Did this work before?

Yes, the setup works fine if we remove the certificates everywhere (Grafana Alertmanager, Mimir ruler, Loki ruler, and the Alertmanager service).

How do we reproduce it?

  1. Add a self-signed certificate to the Alertmanager service
  2. Configure the client key and certificate in the Grafana Alertmanager datasource settings, with tlsSkipVerify set to true
  3. Trigger a Grafana-managed alert

Is the bug inside a dashboard panel?

No response

Environment (with versions)?

Grafana: 10.0.3

Grafana platform?

Kubernetes

Datasource(s)?

No response

rwwiv commented 1 year ago

This is a bug and needs to be addressed. We currently do not use this flag when configuring the connection to an external Alertmanager (see here).
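As a rough illustration of what honoring the flag could look like, the sketch below reads `tlsSkipVerify` from a datasource's `jsonData` and copies it into the `tls.Config` used for the outgoing connection. This is an assumption-laden sketch, not Grafana's actual sender code: the type `amJSONData` and the function `tlsConfigFromJSON` are hypothetical, and only the field names `tlsAuth` and `tlsSkipVerify` come from the configuration in this issue.

```go
package main

import (
	"crypto/tls"
	"encoding/json"
	"fmt"
	"net/http"
)

// amJSONData is a hypothetical stand-in for the relevant jsonData fields of
// an Alertmanager datasource; it is not Grafana's real type.
type amJSONData struct {
	TLSAuth       bool `json:"tlsAuth"`
	TLSSkipVerify bool `json:"tlsSkipVerify"`
}

// tlsConfigFromJSON parses the datasource jsonData and propagates the
// skip-verify flag into the TLS configuration. Unknown fields are ignored.
func tlsConfigFromJSON(raw []byte) (*tls.Config, error) {
	var jd amJSONData
	if err := json.Unmarshal(raw, &jd); err != nil {
		return nil, err
	}
	return &tls.Config{InsecureSkipVerify: jd.TLSSkipVerify}, nil
}

func main() {
	raw := []byte(`{"handleGrafanaManagedAlerts": true, "tlsAuth": true, "tlsSkipVerify": true}`)

	cfg, err := tlsConfigFromJSON(raw)
	if err != nil {
		panic(err)
	}

	// The client used to send alerts would be built on top of this config.
	client := &http.Client{Transport: &http.Transport{TLSClientConfig: cfg}}
	_ = client

	fmt.Println("InsecureSkipVerify:", cfg.InsecureSkipVerify) // prints "InsecureSkipVerify: true"
}
```

A real fix would also need to load the client certificate and key when `tlsAuth` is enabled; that part is left out here to keep the focus on the skipped flag.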