redhat-performance / odf-grafana

Playbook and dashboards to help with ODF performance analysis
GNU General Public License v3.0

OCP/ODF 4.12 Cannot read from prometheus #12

Closed: kdvalin closed 1 year ago

kdvalin commented 1 year ago

On OCP/ODF 4.12, a change to service account token creation appears to break this process.

Current workaround I've found is:

  1. Run oc get secret -n openshift-monitoring | grep prometheus-k8s-token and copy the first column (the secret name)
  2. Run oc get secret -n openshift-monitoring <secret name from step 1> -o json | jq -r .data.token | base64 -d and copy the result
  3. Go to the Grafana datasource page and reset the "Authorization" header contents
  4. Set the "Authorization" header value to Bearer <value from step 2> (the space after Bearer is required)
  5. Save & Test
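
Steps 2 and 4 can be sketched in Python for anyone scripting the workaround. This is only an illustration, not code from this repo: the function name bearer_header_from_secret is hypothetical, and it assumes the input is the JSON printed by oc get secret ... -o json, with the token stored base64-encoded under .data.token as in step 2.

```python
import base64
import json


def bearer_header_from_secret(secret_json: str) -> str:
    """Build the Authorization header value from a secret's JSON.

    Hypothetical helper: assumes `secret_json` is the output of
    `oc get secret -n openshift-monitoring <name> -o json`.
    """
    secret = json.loads(secret_json)
    # Secret data values are base64-encoded; this mirrors
    # `jq -r .data.token | base64 -d` from step 2.
    token = base64.b64decode(secret["data"]["token"]).decode("utf-8").strip()
    # The single space between "Bearer" and the token is required (step 4).
    return f"Bearer {token}"
```

The returned string is what goes into the "Authorization" header value on the Grafana datasource page.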

kdvalin commented 1 year ago

After digging around a little more, I found the issue.

There is no cluster-monitoring-view in current release candidates of ODF 4.12.

Changing cluster-monitoring-view in roles/datasource/tasks/main.yml to prometheus-k8s restores expected functionality. Will open a PR, but additional exploration will be needed for other versions of OCP.

pcuzner commented 1 year ago

@kdvalin is this still the case? Looking at https://docs.openshift.com/container-platform/4.12/monitoring/managing-alerts.html, it appears the role is still there in 4.12.

kdvalin commented 1 year ago

This does seem to be the case for the OCP 4.12 version I'm currently running.

It seems like they renamed the role to openshift-cluster-monitoring-view; testing that now.

kdvalin commented 1 year ago

Update: the role is still in OCP 4.12, but they removed the service account token association from within the service account's JSON, so no token gets selected. The current PR does fix this; I just want to verify everything before opening it up for merge.
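
One way to pick a token when the service account's JSON no longer lists it is to search the namespace's secrets directly. The sketch below is an assumption about how that lookup could work, not the change made in the PR: find_token_secret is a hypothetical helper, and it assumes the input is the items list from oc get secret -n openshift-monitoring -o json. It matches on the standard kubernetes.io/service-account-token secret type and the kubernetes.io/service-account.name annotation.

```python
from typing import Optional

TOKEN_TYPE = "kubernetes.io/service-account-token"
SA_NAME_ANNOTATION = "kubernetes.io/service-account.name"


def find_token_secret(secrets: list, service_account: str) -> Optional[str]:
    """Return the name of a token secret for `service_account`, or None.

    Hypothetical helper: `secrets` is assumed to be the `items` list
    from `oc get secret -n <namespace> -o json`.
    """
    names = []
    for secret in secrets:
        meta = secret.get("metadata", {})
        annotations = meta.get("annotations") or {}
        is_token = secret.get("type") == TOKEN_TYPE
        owned = annotations.get(SA_NAME_ANNOTATION) == service_account
        if is_token and owned:
            names.append(meta.get("name", ""))
    # Sort so the choice is deterministic when several tokens exist.
    return sorted(names)[0] if names else None
```

For this issue the call would be find_token_secret(items, "prometheus-k8s"); a None result means no token secret exists for that account and one would have to be created or requested another way.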

pcuzner commented 1 year ago

@kdvalin I pushed some changes today which should fix this issue. I tested on a 4.12 system and everything appeared to work fine.

kdvalin commented 1 year ago

Sorry, I forgot to respond to this.

The changes have fixed the issue, closing this out.