Closed by Jose-Matsuda 2 years ago
Port-forwarding Alertmanager and navigating to http://localhost:9093/#/status shows the current config. I need to find out where this is set and how I can change it, say via argocd.
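For reference, a sketch of the port-forward step. The namespace and service name here are assumptions (the prometheus-operator typically creates a headless service named `alertmanager-operated`); verify with `kubectl get svc -n monitoring`:

```shell
# Sketch only: namespace and service name are assumptions for this cluster.
NS=monitoring
SVC=alertmanager-operated
CMD="kubectl -n $NS port-forward svc/$SVC 9093:9093"
echo "$CMD"
# Run the command above, then browse to http://localhost:9093/#/status
```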
Though note what it does say there. Can we maybe use a ConfigMap?

Note that the config is mounted here (in the Alertmanager pod), is controlled by the resource shown above, and is populated with the values shown, which are the defaults from the chart.
Prometheus Rules

Note that some already exist in the volume:
In the PR below I was able to get our own test PrometheusRules recognized and included there.
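As a sketch, a minimal PrometheusRule that the operator should pick up looks roughly like this. The name, namespace, and `release` label are assumptions; the chart's `ruleSelector` decides which labels actually matter:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: test-rules
  namespace: monitoring              # assumption
  labels:
    release: kube-prometheus-stack   # assumption: must match the ruleSelector
spec:
  groups:
    - name: test.rules
      rules:
        - alert: AlwaysFiring
          expr: vector(1)
          labels:
            severity: info
          annotations:
            summary: "Test rule to confirm PrometheusRules are picked up"
```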
Pat has graciously directed me to the following repos showing how we end up with Prometheus etc. in the cluster.

We have the specific terraform-kubernetes-kube-prometheus-stack to install the stack; it is referenced by the generic terraform-statcan-kubernetes-core-platform, which in turn is referenced by our own terraform-statcan-aaw-platform repo for our clusters.
Pat also gave more insight: similar to how we custom-set the disk space for Prometheus, we will probably need to make a variable and pass it down the chain. First, at the statcan-aaw repo, we just need to declare the variable and then create the TF_VAR in the git secrets. From there it has to get passed down to the core-platform.
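A rough sketch of what that variable chain could look like; the variable name, module name, and source are hypothetical:

```hcl
# terraform-statcan-aaw-platform (value supplied via a TF_VAR_* git secret)
variable "alertmanager_config" {
  description = "Raw Alertmanager configuration to pass down the chain"
  type        = string
}

module "core_platform" {
  source              = "..." # terraform-statcan-kubernetes-core-platform
  alertmanager_config = var.alertmanager_config
}

# terraform-statcan-kubernetes-core-platform would then forward the same
# variable to terraform-kubernetes-kube-prometheus-stack, which injects it
# into the chart values.
```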
Having said that, if I take a look at the chart I can see this. Maybe we can get away with configuring this, and then on the argocd side we can make changes to the ConfigMap as we see fit? Though I am unsure what happens if we change the ConfigMap while Alertmanager is running.

^ This almost makes sense given what I knew earlier, but a little further down you see
`configSecret`
which matches the pictures I have above in terms of location. The "regular" configuration seems to be taken from here, which populates the secret. The problem with this is that it really only uses `.Values.alertmanager.config` and nothing else.
`configSecret`
I think we can use this, and if so we do not need to do much variable passing (just enable the option), since we can control the config ourselves via argocd with a secret; we would just need to restart Alertmanager when it is updated(?). TODO:
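As a sketch, this approach would mean something like the following: point the chart at our own secret via `alertmanager.alertmanagerSpec.configSecret`, and manage that secret ourselves via argocd. The secret name and namespace are assumptions; note the operator expects the key to be `alertmanager.yaml`:

```yaml
# kube-prometheus-stack values (sketch)
alertmanager:
  alertmanagerSpec:
    configSecret: alertmanager-custom-config   # assumption: our own secret name
---
# Secret managed by us via argocd (sketch)
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-custom-config
  namespace: monitoring                        # assumption
type: Opaque
data:
  alertmanager.yaml: <base64-encoded Alertmanager config>
```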
(as of 12/09/2022)
DEV references terraform-statcan-aaw-platform v3.7.0, which references terraform-statcan-kubernetes-core-platform v1.7.0, which does reference terraform-kubernetes-kube-prometheus-stack v2.0.0 (not much here; focus on the k8s-core-platform as that contains the actual values).
Checking this was important to make sure that we were not missing any key upgrades.
`secret`

`route`

We will likely want to create the secret in our various TF files, like in

Where `data` would be something like
We would need to base64-encode whatever configuration we want and then put that as a secret in the repo. Remember that the configuration needs to match what they specify in the docs (i.e. that alertmanager.yaml in /etc/alertmanager/config).

In this gist, the format/look of this may change; if we go with modifying the secret, we will need to encode it, etc.
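A quick sketch of the encoding step. The config content here is just a placeholder; the real one must match what the Alertmanager docs specify:

```shell
# Placeholder Alertmanager config -- substitute the real configuration
cat > alertmanager.yaml <<'EOF'
route:
  receiver: "null"
receivers:
  - name: "null"
EOF

# base64 with no line wrapping, as needed for a Secret's data field
# (GNU coreutils syntax; on macOS use `base64 -i alertmanager.yaml`)
encoded=$(base64 -w0 alertmanager.yaml)
echo "$encoded"
```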
Closing; will create a new issue to track CNS.
Current Status 20/09/2022
[ ] Needs a bit of elaboration on the other side; align things architecturally so the CNS alerting and this alerting can work side by side. Maybe by end of this week (Friday the 23rd)
Taken from Pat's comment
Steps
[ ] Create a PrometheusRule in argocd that monitors the ElasticSearch PVCs; I would hope that this just gets picked up (just try it), also this
[ ] Need to make changes in the values here, at this alertmanager probably
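The ElasticSearch PVC rule could be sketched roughly like this; the names, labels, PVC regex, and 85% threshold are all assumptions:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: elasticsearch-pvc-alerts
  namespace: monitoring              # assumption
  labels:
    release: kube-prometheus-stack   # assumption: must match the ruleSelector
spec:
  groups:
    - name: elasticsearch-pvc.rules
      rules:
        - alert: ElasticSearchPVCAlmostFull
          expr: |
            kubelet_volume_stats_used_bytes{persistentvolumeclaim=~"elasticsearch-.*"}
              / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=~"elasticsearch-.*"}
              > 0.85
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} is over 85% full"
```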
Possible other issues to read through: https://github.com/prometheus-community/helm-charts/issues/393
Important