integr8ly / application-monitoring-operator

Operator for installing the Application Monitoring Stack on OpenShift (Prometheus, AlertManager, Grafana)
Apache License 2.0

Prometheus not reaching Alertmanager - Error 403 #99

Closed IamRFC1918 closed 4 years ago

IamRFC1918 commented 4 years ago

Hello,

After a fresh installation on my OKD cluster, this error occurs frequently in the Prometheus log:

level=error ts=2019-12-04T13:13:48.741Z caller=notifier.go:528 component=notifier alertmanager=https://10.128.2.31:9091/api/v1/alerts count=1 msg="Error sending alert" err="bad response status 403 Forbidden"

Does anyone know how to fix this? I think it is related to the oauth proxy in front of Alertmanager.

Best regards,

Frank

IamRFC1918 commented 4 years ago

I think the problem is that the ClusterRole prometheus-application-monitoring does not have get permission on namespaces. The oauth proxy in front of Alertmanager requires it:

`oc describe clusterrole.rbac prometheus-application-monitoring
Name:         prometheus-application-monitoring
Labels:
Annotations:  kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"creationTimestamp":"2019-11-21T11:05:24Z",...
PolicyRule:
  Resources                                  Non-Resource URLs  Resource Names  Verbs
  tokenreviews.authentication.k8s.io         []                 []              [create]
  subjectaccessreviews.authorization.k8s.io  []                 []              [create]
  endpoints                                  []                 []              [get list watch]
  nodes                                      []                 []              [get list watch]
  pods                                       []                 []              [get list watch]
  services                                   []                 []              [get list watch]
                                             [/metrics]         []              [get]
  configmaps                                 []                 []              [get]`

alertmanager-proxy:
  Container ID:  docker://97c1cbfe73906a3b6284dedbd9c6601b5f654eb8941b582a36044c5685fe863c
  Image:         quay.io/openshift/origin-oauth-proxy:4.2
  Image ID:      docker-pullable://quay.io/openshift/origin-oauth-proxy@sha256:9f0c0a0981c0c9295880e5ba209b2c102438a560b0a622b44dac766c67ea4cd4
  Port:          9091/TCP
  Host Port:     0/TCP
  Args:
    -provider=openshift
    -https-address=:9091
    -http-address=
    -email-domain=*
    -upstream=http://localhost:9093
    -openshift-sar={"resource": "namespaces", "verb": "get"}
    -openshift-delegate-urls={"/": {"resource": "namespaces", "verb": "get"}}

After adding this permission to the ClusterRole, Alertmanager receives alerts from Prometheus.
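For reference, a minimal sketch of what such a rule could look like, assuming it is appended to the existing `rules:` list of the prometheus-application-monitoring ClusterRole. This is not the exact change that was applied or later merged, only an illustration of a rule that matches the proxy's `{"resource": "namespaces", "verb": "get"}` check:

```yaml
# Illustrative rule only (assumption: added under the ClusterRole's existing "rules:" list).
# Namespaces belong to the core API group, hence the empty apiGroups entry.
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get"]
```

With a rule like this in place, the subject access review that the oauth proxy performs for the Prometheus bearer token should succeed instead of returning 403.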

Can someone confirm this? If it is confirmed, I would like to open a pull request to fix it.

BR,

Frank

pb82 commented 4 years ago

Hey Frank, sorry for the delay. You are correct, the Prometheus serviceaccount needs to have cluster level permissions on namespaces. We've since fixed this in our upstream rbac definitions but it should be fixed here too. I'll create an issue for that.

Thanks a lot for digging into that!

pb82 commented 4 years ago

fixed now with #102