Ferrany1 opened 2 years ago
In your example the telegram.tmpl isn't mounted on the container. Could this be the problem?
No, sorry, I copied the latest version without the mount; I was testing with the proper mount, though, and I've checked that the file is actually mounted at that path.
Are you configuring alertmanager.yml by uploading it with the mimirtool alertmanager load command (along with the template), or are you configuring it as the fallback Alertmanager configuration?
As a fallback.
mimir.yml (relevant part):
```yaml
alertmanager:
  data_dir: /mimir/alertmanager
  fallback_config_file: /etc/mimir/alertmanager.yml
  external_url: http://127.0.0.1:8080/alertmanager
```
You raised a very good point. The alertmanager fallback configuration currently doesn't support templates. This is something we should fix.
As a workaround, could you upload the Alertmanager YAML config + templates for the specific tenant using mimirtool alertmanager load instead (doc)?
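For example, something like this (the address and tenant ID are placeholders to substitute for your setup):

```
mimirtool alertmanager load ./alertmanager.yml ./telegram.tmpl \
  --address=http://<mimir-address>:8080 \
  --id=<tenant-id>
```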
I've managed to work around it by putting the full message template (without the define block) directly into alertmanager.yml. If you could point me to the loader, I'll have a look into it and maybe make a PR with fixes, obviously if it's needed and the team isn't currently working on it.
No one is working on it and we would love your help! ❤️
The fallback config is loaded from here: https://github.com/grafana/mimir/blob/main/pkg/alertmanager/multitenant.go#L840
The alertmanagerFromFallbackConfig() function is a bit tricky. The way it works is by creating an empty config definition and storing it in the backend storage:
https://github.com/grafana/mimir/blob/3c8fabdbece41f894a49c7024cdd5982fa26924d/pkg/alertmanager/multitenant.go#L864-L868
Then we call setConfig(), which loads the fallback config if the stored config is empty (it was forcefully set to empty in alertmanagerFromFallbackConfig()):
https://github.com/grafana/mimir/blob/3c8fabdbece41f894a49c7024cdd5982fa26924d/pkg/alertmanager/multitenant.go#L675-L684
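To make that flow easier to follow, here's a heavily simplified, self-contained sketch of the two steps; the type and method names are stand-ins, not the real Mimir API (which lives in pkg/alertmanager/multitenant.go):

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical stand-ins for Mimir's internals; the real types look
// quite different.
type alertConfig struct {
	User      string
	RawConfig string
}

type multitenantAM struct {
	store          map[string]alertConfig // stand-in for the backend storage
	fallbackConfig string                 // contents of -alertmanager.fallback-config-file
}

// Step 1: persist an intentionally empty config for the tenant, then
// hand it to setConfig.
func (am *multitenantAM) alertmanagerFromFallbackConfig(_ context.Context, user string) error {
	empty := alertConfig{User: user, RawConfig: ""}
	am.store[user] = empty
	return am.setConfig(empty)
}

// Step 2: an empty RawConfig triggers the substitution of the fallback
// config. Only the YAML is substituted; templates referenced by it are
// never materialized for the tenant, which is why template lookups fail.
func (am *multitenantAM) setConfig(cfg alertConfig) error {
	raw := cfg.RawConfig
	if raw == "" {
		raw = am.fallbackConfig
	}
	fmt.Printf("starting Alertmanager for tenant %q with config %q\n", cfg.User, raw)
	return nil
}

func main() {
	am := &multitenantAM{store: map[string]alertConfig{}, fallbackConfig: "route: ..."}
	_ = am.alertmanagerFromFallbackConfig(context.Background(), "anonymous")
}
```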
Is someone working on it? Or can we think about contributing to it?
No one is working on it. You're welcome to contribute! ❤️
Definitely, this is something I would love to have <3
As an idea, what about creating a small watcher for k8s that detects changes to a ConfigMap holding the configs and then uses mimirtool to upload them from time to time?
Running into the same issue. I chose to try the fallback config as there's no way to configure Alertmanager configs without mimirtool (and I don't want a manual step for configuring Mimir). Is there any other solution currently? Has anyone ever worked on this? I'm not capable of doing it myself.
As I said previously, if you run mimirtool in a CronJob, with your config and your templates loaded into it, you can upload your config periodically, let's say every 5 minutes, and it's automated. We use it that way and it's working well 😊 There's a minimal sketch of that setup below.
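For reference, a minimal sketch of that approach as a Kubernetes CronJob; the image tag, Mimir address, tenant ID, and ConfigMap name are all placeholders to adapt to your setup:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mimir-alertmanager-config-sync
spec:
  schedule: "*/5 * * * *"  # every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: mimirtool
              image: grafana/mimirtool:latest  # placeholder; pin a real version
              args:
                - alertmanager
                - load
                - /config/alertmanager.yml
                - /config/telegram.tmpl
                - --address=http://mimir:8080  # placeholder Mimir address
                - --id=anonymous               # placeholder tenant ID
              volumeMounts:
                - name: alertmanager-config
                  mountPath: /config
          volumes:
            - name: alertmanager-config
              configMap:
                name: alertmanager-config  # ConfigMap holding config + template
```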
@pracucci Seems like I've managed to fix it, but I have no idea how to write tests to check it, since you're not testing the Alertmanager itself in Mimir, and the only option I see is to push alerts to the notifier, where they are dispatched directly to the receivers.
I've tested it locally with the following configs:
mimir.yaml:
```yaml
target: all,alertmanager,ruler
multitenancy_enabled: false
no_auth_tenant: "anonymous"

blocks_storage:
  backend: filesystem
  bucket_store:
    sync_dir: ./temp/tsdb-sync
  filesystem:
    dir: ./temp/data/tsdb
  tsdb:
    dir: ./temp/tsdb

compactor:
  data_dir: ./temp/compactor
  sharding_ring:
    kvstore:
      store: memberlist

distributor:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist

ingester:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist
    replication_factor: 1

ruler:
  alertmanager_url: http://127.0.0.1:8080/alertmanager

ruler_storage:
  backend: local
  local:
    directory: ./temp/fs_rules

alertmanager:
  data_dir: ./temp/alertmanager
  fallback_config_file: ./alertmanager.yaml
  external_url: http://127.0.0.1:8080/alertmanager

alertmanager_storage:
  backend: filesystem
  filesystem:
    dir: ./temp/alerts

limits:
  max_label_names_per_series: 100

server:
  log_level: warn
  http_listen_port: 8080

store_gateway:
  sharding_ring:
    replication_factor: 1
```
alertmanager.yaml:
```yaml
route:
  repeat_interval: 30s
  group_interval: 60s
  group_wait: 30s
  receiver: 'telegram'

templates:
  - './telegram.tmpl'

receivers:
  - name: "telegram"
    telegram_configs:
      - bot_token: ''
        chat_id: ''
        api_url: https://api.telegram.org
        message: '{{ template "telegram.message" . }}'
```
telegram.tmpl:
```
{{ define "telegram.message" }}
test
{{ end }}
```
Sorry for taking so long, I totally forgot about this issue for a year.
@pracucci Can you help me with the PR?
@pracucci before I take a look at #6495, is it possible Mimir doesn't support templates in the fallback configuration on purpose to avoid a situation where the fallback configuration fails for the same reason as the main configuration (i.e. a shared, bad template)?
We should ask @gotjosh and @stevesg because they know better. I don't remember any discussion where we decided not to support it on purpose; I have more the feeling this was an oversight on our side.
However, I think we should ideally validate the fallback config and not start the alertmanager if some required templates are missing.
The main issue here is that the template in the fallback configuration can fail at runtime: not because it's absent on disk, but because of a syntax error in the template, or because it attempts to access a field in a struct which does not exist. A lot of this can be mitigated with static analysis, but it's a lot of work.
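For example, this tiny Go program shows that failure mode: the template parses fine (so it would pass load-time checks) but errors at execution time because it dereferences a field that doesn't exist on the data:

```go
package main

import (
	"fmt"
	"os"
	"text/template"
)

// The template parses successfully, but execution fails because
// .NoSuchField doesn't exist on the data it is executed with.
func main() {
	tmpl := template.Must(template.New("msg").Parse(
		`{{ .CommonLabels.alertname }} fired {{ .NoSuchField }}`))

	data := struct {
		CommonLabels map[string]string
	}{CommonLabels: map[string]string{"alertname": "TestAlert"}}

	// Parse succeeded above; Execute is where it blows up:
	//   can't evaluate field NoSuchField in type struct { ... }
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		fmt.Println("\nexecute error:", err)
	}
}
```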
Do you still need my PR attached to this issue, or can I abandon it?
Describe the bug
Alertmanager can't parse a definition from a custom template, resulting in an "empty telegram message" error when it tries to send.
To Reproduce
Steps to reproduce the behavior:
docker-compose.yaml
alertmanager.yml
telegram.tmpl
Expected behavior
The alert is sent as a Telegram message.
Environment
Additional Context
I've tested the template via https://github.com/prometheus/alertmanager/blob/main/template/template_test.go with '{{ template "telegram.message" . }}' and everything works correctly. I haven't tried deploying a standalone Prometheus Alertmanager, but it seems it would work fine.
Currently, to make everything work, I've put the template message body itself fully into the alertmanager.yml message field, without the template reference, and it works OK.
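In other words, something along these lines (a sketch of that workaround with the template body inlined into the message field; the token and chat ID are redacted placeholders):

```yaml
receivers:
  - name: "telegram"
    telegram_configs:
      - bot_token: ''  # redacted
        chat_id: ''    # redacted
        api_url: https://api.telegram.org
        # Template body inlined directly, instead of
        # '{{ template "telegram.message" . }}':
        message: |
          test
```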