redhat-cop / monitoring

Assets to manage monitoring infrastructure and applications

Error Deploying AlertManager on OpenShift #6

Open sabre1041 opened 6 years ago

sabre1041 commented 6 years ago

The following error occurs when trying to deploy AlertManager on OpenShift as part of the Prometheus stack:

level=info ts=2018-08-17T16:35:46.660653589Z caller=main.go:174 msg="Starting Alertmanager" version="(version=0.15.0, branch=HEAD, revision=462c969d85cf1a473587754d55e4a3c4a2abc63c)"
level=info ts=2018-08-17T16:35:46.660738988Z caller=main.go:175 build_context="(go=go1.10.3, user=root@bec9939eb862, date=20180622-11:58:41)"
level=info ts=2018-08-17T16:35:46.667289233Z caller=cluster.go:155 component=cluster msg="setting advertise address explicitly" addr=10.129.1.164 port=9094
level=info ts=2018-08-17T16:35:46.670038371Z caller=cluster.go:561 component=cluster msg="Waiting for gossip to settle..." interval=2s
level=info ts=2018-08-17T16:35:46.670291123Z caller=main.go:311 msg="Loading configuration file" file=alertmanager.yml
level=error ts=2018-08-17T16:35:46.670737611Z caller=main.go:314 msg="Loading configuration file failed" file=alertmanager.yml err="scheme required for webhook url"
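
The last log line says the configuration was rejected because a webhook URL had no scheme (for example, a missing `https://`). As a rough illustration of that kind of check, here is a minimal Python sketch. It is not Alertmanager's actual code: the function name `webhook_url_error` and the example URLs are hypothetical, and only the error text is taken from the log above.

```python
from typing import Optional
from urllib.parse import urlsplit


def webhook_url_error(url: str) -> Optional[str]:
    """Return an error message if the URL has no scheme, otherwise None.

    Hypothetical helper for illustration; it only mimics the check that
    the log message above appears to describe.
    """
    parts = urlsplit(url.strip())
    if not parts.scheme:
        # An empty value or a bare host/path such as "chat.example.com/hooks/abc"
        # parses with an empty scheme.
        return "scheme required for webhook url"
    return None


if __name__ == "__main__":
    # Example values are made up; they are not taken from the template.
    for candidate in ("", "chat.example.com/hooks/abc", "https://chat.example.com/hooks/abc"):
        print(repr(candidate), "->", webhook_url_error(candidate))
```

Under this reading, an empty or placeholder webhook value in the rendered `alertmanager.yml` would be enough to trigger the failure.
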

makentenza commented 6 years ago

I noticed this error as well. The template we have expects AlertManager to be configured with RocketChat webhooks as the notification target. We should let users configure endpoints other than RocketChat, but that requires some logic, such as separate templates or, ideally, Jinja code in the template that is processed with Ansible. openshift-applier doesn't support that, so we need to figure out how we want to implement this. I'm not sure a pre-step for the applier would work, since we would need to hack the template source that gets passed to the applier before it even exists...

Any thoughts @sabre1041 @oybed ?

oybed commented 6 years ago

@makentenza yeah, that was one of my comments on the original PR, i.e. to ensure RocketChat was optional. Sounds like maybe it isn't? Either way, we could potentially use tags with the openshift-applier. However, this requires breaking the big template file into smaller ones, as also commented on the initial PR. I'd say let's start with the latter and see what we can do with tags.

Using "smaller" templates is how it was done for Dynatrace to allow for greater control of this sort of thing: https://github.com/redhat-cop/monitoring/pull/3

makentenza commented 6 years ago

@oybed that doesn't solve the problem as we need to change the template depending on what the user wants to use.

https://github.com/redhat-cop/monitoring/blob/master/prometheus/openshift/files/templates/metrics.yml#L1131-L1135

We need to convert this ConfigMap into a Jinja template and add conditional blocks there to configure different receivers, or no receivers at all. This is the point where the applier doesn't meet the requirement.
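
To make the idea of conditional receiver blocks concrete, here is a minimal sketch. It uses plain Python string building as a stand-in for a Jinja template, so it is not the proposed implementation. The function name `render_alertmanager_config`, the receiver name `default`, and the example URL are assumptions for illustration; the `route`, `receivers`, and `webhook_configs` keys follow the Alertmanager configuration format.

```python
from typing import Optional
from urllib.parse import urlsplit


def render_alertmanager_config(webhook_url: Optional[str] = None) -> str:
    """Build a minimal alertmanager.yml.

    A webhook receiver is added only when a URL is supplied; otherwise the
    default receiver is left without any notification target.
    """
    lines = [
        "route:",
        "  receiver: default",
        "receivers:",
        "  - name: default",
    ]
    if webhook_url:
        # Fail early rather than letting Alertmanager reject the file at startup.
        if not urlsplit(webhook_url).scheme:
            raise ValueError(f"webhook URL needs a scheme such as https://: {webhook_url!r}")
        lines += [
            "    webhook_configs:",
            f"      - url: '{webhook_url}'",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Without a webhook: a receiver that sends nothing.
    print(render_alertmanager_config())
    # With a (made-up) webhook endpoint.
    print(render_alertmanager_config("https://chat.example.com/hooks/abc"))
```

In a Jinja version, the `if webhook_url:` branch would correspond to an `{% if %}` block around the `webhook_configs` section of the ConfigMap.
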

oybed commented 6 years ago

@makentenza I do believe my earlier proposal would solve it, and I think the solution would be quite elegant and possibly even reusable across multiple solutions. We can discuss more next time we meet, and I'll see if I can put together a quick PoC to provide more info that way.