Custom Metrics Verification and Debugging

hawkular / hawkular-openshift-agent

A Hawkular feed that collects metrics from Prometheus and/or Jolokia endpoints deployed in one or more pods within an OpenShift node.


Custom Metrics Verification and Debugging #59

Open mwringe opened 7 years ago

mwringe commented 7 years ago

Currently there is no good way for a user to know if they configured their custom metrics properly.

They would need to wait and see if metrics start appearing, and if they don't show up there is no way to know why (e.g. is the metric URL wrong? Is the metric URL right but denying access? Does the metric name have a typo? Is the metric configmap not valid YAML, so it can't be processed? Has the agent failed, or is it not yet installed?).

Ideally we should have some way to let the user know there is something wrong with their setup. Users will not have access to the agent's logs or even know if the agent is installed.

I don't know if it makes sense to expose an http endpoint in the agent to return the status for a pod (secured of course) or if there would be a better way of doing this.

jmazzitelli commented 7 years ago

Is there any way in the k8s API for a service like the agent to write to its own config? Like write its own pod annotation(s) or maybe a new configmap?

Perhaps it would write annotations or even a configmap with status. Like maybe have these pod annotations get created by the agent:

hawkular-openshift-agent/endpoint1: http://the-endpoint:8080/metrics: OK
hawkular-openshift-agent/endpoint2: http://another-one:9191/jolokia-war: Status 404: Not Found
hawkular-openshift-agent/endpoint3: https://blah/metrics: Error: Invalid fooblat caused by bad splingbar

I'm sure that isn't what annotations or even configmaps are for. Just an idea. Though I do like having a status endpoint as the better idea.

mwringe commented 7 years ago

Is there any way in the k8s API for a service like the agent to write to its own config? Like write its own pod annotation(s) or maybe a new configmap?

Yes, we can grant permissions to the agent to be able to modify its own configmap or annotations.

It's also technically possible for the agent to modify this for all components in the cluster, but there is no way we can get approval to do this. Nor do I think we should; I don't think this is really the right approach.

Though I do like having a status endpoint as the better idea.

Yeah, I am beginning to think this might be the only good option. It would be a bit tricky to set up with security.

I don't think we need to worry about this stuff today, but I think it's something we should be thinking about.

jmazzitelli commented 7 years ago

PR #90 is supposed to address this issue