wkulhanek / openshift-prometheus

Repository for all things related to Prometheus on OpenShift

Back-off restarting failed container #19

Open cake99 opened 6 years ago

cake99 commented 6 years ago

Hello,

I have followed the instructions and overall it seems OK (I have three nodes running the node_exporter containers; only the one on the master is missing).

It is constantly restarting. I have this in the logs:

FailedSync error determining status: rpc error: code = 2 desc = Error response from daemon: {"message":"devmapper: Unknown device 83da9c1200eae4ae9ac90f1c8c03fa5087b588f1e1d82fdde00cc4df5638d385"}

In the alert-proxy pod, I have this:

2018/03/16 12:22:43 oauthproxy.go:657: 10.131.0.44:50208 Cookie "_oauth_proxy" not present

2018/03/16 12:22:43 provider.go:347: authorizer reason: User "system:anonymous" cannot get namespaces in project "openshift-metrics"

Any ideas what I did wrong?

Thanks a lot!

wkulhanek commented 6 years ago

Hi @cake99, are you mixing this container with the Prometheus pod that ships with OpenShift? Your mention of 'openshift-metrics' seems to indicate that. OCP can install Prometheus automatically, but it does so in a pod that uses oauth-proxies to secure both Prometheus and Alertmanager. The bigger problem then is that Grafana can't connect to Prometheus behind the proxy. :-(

As for the error message: it means that the user running the pod is not authorized to access the openshift-metrics project.

If you are using the template from this repo, it sets up a ClusterRoleBinding that gives Prometheus the correct permissions.
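
For illustration, here is a rough sketch of what such a binding could look like, written as a small Python script that prints the manifest as JSON. This is not copied from the template: the binding name, the service account name `prometheus`, the namespace `prometheus`, and the choice of the `cluster-reader` role are all assumptions, so check them against what your template actually creates.

```python
import json


def cluster_role_binding(binding_name, service_account, namespace,
                         cluster_role="cluster-reader"):
    """Build a ClusterRoleBinding manifest as a plain dict.

    All names passed in are hypothetical placeholders; "cluster-reader"
    is assumed here as a read-only cluster role, not taken from the template.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": binding_name},
        # The role being granted cluster-wide.
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        # The service account the Prometheus pod is assumed to run as.
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
    }


if __name__ == "__main__":
    manifest = cluster_role_binding(
        "prometheus-cluster-reader",  # hypothetical binding name
        "prometheus",                 # assumed service account
        "prometheus",                 # assumed project/namespace
    )
    print(json.dumps(manifest, indent=2))
```

If the names match your setup, the printed JSON can be piped to `oc apply -f -` by a user with cluster-admin rights. Older clusters may need a different `apiVersion` for this resource.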