marwanad opened this issue 1 year ago
That's correct, thanks for raising this.
Alertmanager is a StatefulSet, but with a best-effort emptyDir volume, which does not guarantee any persistence. In a self-deployed setup adding persistence is possible, since you can modify the Alertmanager resource, but not in managed GMP.
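To illustrate, the relevant part of that StatefulSet looks roughly like this (a sketch only; names and mount path are illustrative, not the operator's exact output):

```yaml
# Fragment of the managed Alertmanager StatefulSet spec (illustrative names,
# not the exact manifest the operator generates): the data dir is an emptyDir.
spec:
  template:
    spec:
      containers:
      - name: alertmanager
        volumeMounts:
        - name: alertmanager-data
          mountPath: /data
      volumes:
      - name: alertmanager-data
        emptyDir: {}   # best effort only; contents are lost when the pod is rescheduled
```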
We could discuss this feature as a team if you want; it feels like something we could consider, but at a lower priority. Help is also wanted to contribute this feature, which might get it done faster.
Just curious, what's your use case for the managed Alertmanager? Would our recent cloud feature currently in preview, PromQL for Cloud Monitoring Alerting, help?
@bwplotka thanks for the response! I think there was no way to disable the deployment of the managed alertmanager through the GMP operator at the time, so we ended up using it instead of having duplicate deployments.
So it's basically the same use case as for an unmanaged Alertmanager: at the time we couldn't define PromQL rules in Cloud Monitoring, and we needed more control over the notification channel configs for Slack, PagerDuty, etc. The preview feature looks interesting and covers a subset of our use case, but we'll still need Alertmanager for generic webhook channels.
Note that Cloud Alerting PromQL does support generic webhook channels: https://cloud.google.com/monitoring/support/notification-options#webhooks
We are facing the same problem. All of our silences are gone after a pod restart, and we need to recreate all of them manually. In the last two weeks this happened twice. So this improvement would be very helpful for us as well!
Sorry for the lag; this is on our radar again, and we are brainstorming how to enable persistent volumes here.
Interestingly, there is a very nasty "persistent" workaround for silences in the meantime: https://github.com/prometheus/alertmanager/issues/1673#issuecomment-819421068 (thanks @TheSpiritXIII for the finding!)
Just a quick question for users who care about this feature: which managed collection (this operator) deployment model do you use?
1️⃣ The one available on GKE (fully managed). If that's the case, how do you submit the silences?
2️⃣ Self-deployed operator (via kubectl). If that's the case, what stops you from manually adjusting the Alertmanager StatefulSet YAML to your needs and re-applying it (see the sketch below)? The operator will manage that one just fine, as long as you keep the labels, namespace and name the same.
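For 2️⃣, a minimal sketch of the kind of change meant here (field values and storage size are assumptions, not the operator's exact manifest):

```yaml
# Sketch of option 2: keep the operator-managed name, namespace and labels,
# but replace the emptyDir volume with a volumeClaimTemplate so the data dir
# survives restarts. Values below are illustrative assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: alertmanager
  namespace: gmp-system   # keep whatever namespace the operator deployed it to
spec:
  # ...rest of the spec as deployed by the operator, minus the emptyDir volume...
  template:
    spec:
      containers:
      - name: alertmanager
        volumeMounts:
        - name: alertmanager-data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: alertmanager-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
# Note: volumeClaimTemplates is immutable, so in practice this means deleting
# and recreating the StatefulSet rather than patching it in place.
```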
cc @m3adow @marwanad @taldejoh
@bwplotka appreciate the updates on this :)
We were using option 1, setting silences by port-forwarding to the running Alertmanager instance and adding them through the UI, or using amtool to submit them.
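Roughly what that looked like, for reference (a sketch; the namespace, resource name and port are assumptions from our setup):

```bash
# Rough sketch of how we submitted silences against the managed instance.
# Namespace, resource name and port are assumptions from our setup.
kubectl -n gmp-system port-forward statefulset/alertmanager 9093:9093 &

# Either open http://localhost:9093 and add the silence in the UI,
# or submit it with amtool:
amtool silence add alertname="SomeNoisyAlert" \
  --alertmanager.url=http://localhost:9093 \
  --comment="maintenance window" \
  --duration=2h
```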
We've then switched to a self-deployed Alertmanager instance to get more control over this, and set the alertmanagers field in the operator config to point to our self-managed instance.
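And roughly how we pointed rule evaluation at it (a sketch; please double-check the exact field names against the current OperatorConfig reference):

```yaml
# Sketch of pointing rule evaluation at our self-managed Alertmanager via the
# OperatorConfig resource. The service name, namespace and port are ours;
# verify the exact schema against the prometheus-engine OperatorConfig docs.
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  name: config
  namespace: gmp-public
rules:
  alerting:
    alertmanagers:
    - name: alertmanager       # Service of our self-managed Alertmanager
      namespace: monitoring
      port: 9093
```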
We're using option 1 as well. We're currently in the process of migrating from kube-prometheus-stack to GMP, and we want to have as much of the "GM" as possible. 😄
Right now, we're also using port-forwarding and the UI to silence alerts. As the alerts are sent to Teams channels, we don't have an option to silence the alerts later on in the alerting chain.
Epic, thanks for the clarifications!
In the managed Alertmanager, the alertmanager-data volume is an emptyDir, which means that configured silences and notification state won't persist across pod restarts. Is there a way to have a configurable PVC for the data dir with the managed Alertmanager?