carlosedp / cluster-monitoring

Cluster monitoring stack for clusters based on Prometheus Operator
MIT License

metallb module has improper RBAC setup #86

Closed Nashluffy closed 4 years ago

Nashluffy commented 4 years ago

After enabling the metallb module, no metrics appear in Grafana. The logs from the prometheus-k8s pod show:

level=error ts=2020-08-16T04:40:30.313Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:30.430Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:30.843Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"metallb-system\""
level=info ts=2020-08-16T04:40:31.391Z caller=main.go:799 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2020-08-16T04:40:31.405Z caller=kubernetes.go:253 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=warn ts=2020-08-16T04:40:31.406Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:361: watch of *v1.Endpoints ended with: an error on the server (\"unable to decode an event from the watch stream: context canceled\") has prevented the request from succeeding"
level=error ts=2020-08-16T04:40:31.422Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:31.435Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:31.437Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"metallb-system\""
level=info ts=2020-08-16T04:40:31.458Z caller=main.go:827 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=error ts=2020-08-16T04:40:32.440Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:32.464Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:32.659Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"metallb-system\""

To me it looks like an RBAC issue. I think the metallb module should be configured similarly to the arm-exporter module, with its own Role, RoleBinding, and ServiceAccount.
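To show how the RBAC diagnosis follows from the logs, here is a minimal Python sketch that pulls the denied permissions out of the "is forbidden" messages. The regular expression and the `missing_permissions` helper are illustrative assumptions, not part of Prometheus or this repository. The sample lines are copied from the log above.

```python
import re

# Matches the "User ... cannot <verb> resource ... in the namespace ..." part
# of a Kubernetes "forbidden" error, with or without backslash-escaped quotes.
FORBIDDEN = re.compile(
    r'User \\?"(?P<user>[^"\\]+)\\?" cannot (?P<verb>\w+) '
    r'resource \\?"(?P<resource>[^"\\]+)\\?" '
    r'in API group \\?"(?P<group>[^"\\]*)\\?" '
    r'in the namespace \\?"(?P<namespace>[^"\\]+)\\?"'
)


def missing_permissions(log_text):
    """Return the set of (user, verb, resource, namespace) tuples denied in the log."""
    return {
        (m["user"], m["verb"], m["resource"], m["namespace"])
        for m in FORBIDDEN.finditer(log_text)
    }


# Three lines taken verbatim from the prometheus-k8s log in this issue.
SAMPLE = r'''
level=error ts=2020-08-16T04:40:30.313Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:30.430Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"metallb-system\""
level=error ts=2020-08-16T04:40:30.843Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"metallb-system\""
'''

if __name__ == "__main__":
    for user, verb, resource, namespace in sorted(missing_permissions(SAMPLE)):
        print(f"{user} cannot {verb} {resource} in {namespace}")
```

Every denial names the same subject (the prometheus-k8s service account in the monitoring namespace) and the same target namespace (metallb-system), which is why the problem reads as a missing Role and RoleBinding rather than a scrape configuration error.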

Nashluffy commented 4 years ago

Proposed fix is PR https://github.com/carlosedp/cluster-monitoring/pull/87

carlosedp commented 4 years ago

Fixed by #87.

jontg commented 3 years ago

This change did not fix the problem for me, because Prometheus is not bound to the clusterRole (an unused metallb-exporter is bound instead). Prometheus itself scrapes the metrics for metallb, so we need to bind the prometheus-k8s service account to the metallb-exporter clusterRole. I've filed a follow-on ticket, https://github.com/carlosedp/cluster-monitoring/issues/98, with a proposed fix.
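For readers hitting the same errors, the following is a rough sketch of the kind of binding described above. It is not the change made in #87 or the fix proposed in #98. It assumes a namespaced Role rather than the metallb-exporter clusterRole, and the role name `prometheus-k8s-metallb` and the `prometheus_rbac` helper are hypothetical. The service account name and both namespaces are taken from the log messages in this issue.

```python
import json


def prometheus_rbac(namespace="metallb-system",
                    sa_name="prometheus-k8s",
                    sa_namespace="monitoring",
                    role_name="prometheus-k8s-metallb"):  # hypothetical name
    """Build a Role and RoleBinding that let Prometheus discover scrape targets."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [{
            "apiGroups": [""],  # core API group, as in the error messages
            "resources": ["services", "endpoints", "pods"],
            "verbs": ["get", "list", "watch"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
        # The subject is the Prometheus service account, not a separate exporter account.
        "subjects": [{
            "kind": "ServiceAccount",
            "name": sa_name,
            "namespace": sa_namespace,
        }],
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


if __name__ == "__main__":
    # Emits JSON that can be piped to `kubectl apply -f -`.
    print(json.dumps(prometheus_rbac(), indent=2))
```

The key point is the `subjects` entry: the binding has to name the prometheus-k8s service account in the monitoring namespace, since that is the identity the API server rejects in the logs.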