nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

How do we manage custom dashboards? #54

Closed larsks closed 9 months ago

larsks commented 1 year ago

If we create custom dashboards, how do we save those configurations so that they are not lost in the event of a cluster rebuild or Grafana reinstall? Can we store them in a git repository and have them deployed automatically?
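One possible shape for keeping dashboards in git, sketched here under assumptions: each dashboard is exported from Grafana as a JSON file, committed next to a `kustomization.yaml`, and turned into a ConfigMap by Kustomize's `configMapGenerator`. The namespace, ConfigMap name, and file path are hypothetical placeholders, and whether a given Grafana instance picks up such a ConfigMap depends on how that instance is configured.

```yaml
# kustomization.yaml (sketch; all names and paths are hypothetical)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Hypothetical namespace that the Grafana instance reads dashboards from
namespace: example-dashboards

generatorOptions:
  # Keep ConfigMap names stable so that a redeploy replaces the existing
  # object instead of creating a new, hash-suffixed one
  disableNameSuffixHash: true
  # Any labels the dashboard loader expects could be added here under "labels:"

configMapGenerator:
  - name: example-dashboard
    files:
      # Dashboard JSON exported from Grafana and committed to git
      - dashboards/example-dashboard.json
```

With a layout like this, a GitOps tool or a plain `oc apply -k` could recreate the dashboards after a cluster rebuild.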

includes @jbasu01 @harshil-codes

computate commented 1 year ago

We want to be able to create the custom dashboards defined here in the ACM docs, but as infrastructure as code in the nerc-ocp-config repo.

computate commented 1 year ago

We want to include the OpenShift Logging Collection dashboard from the default Observe -> Dashboards in OpenShift for logging performance.

computate commented 1 year ago

@harshil-codes suggested we use the dashboard ConfigMaps in the openshift-config-managed namespace to build our custom dashboards in ACM. We need to make sure they can handle multiple clusters instead of only local-cluster.

computate commented 1 year ago

@jbasu01 @harshil-codes We will want to add back this grafana directory to nerc-ocp-config, but move the Grafana instance into a different namespace besides openshift-logging.

apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
  name: loki-datasource
  namespace: ...
spec:
  datasources:
    - access: proxy
      editable: false
      isDefault: true
      jsonData:
        httpHeaderName1: Authorization
        timeInterval: 5s
        tlsSkipVerify: true
      name: Prometheus
      secureJsonData:
        httpHeaderValue1: >-
          Bearer ...
      type: prometheus
      url: 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091/'
  name: custom-observability-grafana

We will need to handle the Bearer token as a secret rather than committing it in plain text, for example by keeping it in our Vault.
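As a rough sketch of keeping the token out of git, the manifest below assumes the External Secrets Operator is installed and that a ClusterSecretStore backed by Vault already exists; neither is confirmed in this thread. The resource names, the store name, and the Vault path are hypothetical placeholders, and the `apiVersion` may differ depending on the operator version. How the datasource then consumes the resulting Secret depends on the Grafana Operator version and is not shown.

```yaml
# Sketch only: assumes External Secrets Operator plus a Vault-backed
# ClusterSecretStore. All names and paths here are hypothetical.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: grafana-datasource-token      # hypothetical name
  namespace: ...
spec:
  refreshInterval: 1h                 # how often to re-read the value from Vault
  secretStoreRef:
    kind: ClusterSecretStore
    name: example-vault-store         # hypothetical store name
  target:
    name: grafana-datasource-token    # Kubernetes Secret created in the cluster
  data:
    - secretKey: token                # key inside the generated Secret
      remoteRef:
        key: example/path/grafana     # hypothetical Vault path
        property: token               # field within that Vault entry
```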

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-serviceaccount-cluster-monitoring-view
roleRef:
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
  name: cluster-monitoring-view
subjects:
  - kind: ServiceAccount
    name: grafana-serviceaccount
    namespace: ...

jbasu01 commented 1 year ago

Custom Grafana Dashboard

Last week we successfully created and tested a custom Grafana dashboard. That exercise helped in developing the following three options:

Option 1 - Create three Grafana instances: the out-of-the-box one providing the standard metrics, a second one for administrators with "editor" rights, and a third one with "read-only" access and datasources loaded for development and testing purposes.

Option 2 - Use the out-of-the-box ACM Grafana instance as the "read-only" instance. Create a separate custom dashboard and add it to that instance. (This needs to be validated.)

Option 3 - Have two Grafana instances, grafana-dev and grafana-prod, in the same namespace, and ignore the instance that comes out of the box with ACM.

For all three options, we might be able to use the storage that comes with the out of the box ACM as the datasource for Prometheus.

Option #3 might be the best approach, and we could collectively discuss and decide the next steps during our upcoming meeting.

computate commented 1 year ago

@jbasu01 Based on our testing today, we can successfully add any dashboard to the read-only ACM Observability Grafana instance. Let's put final dashboards into ConfigMaps in the open-cluster-management-observability namespace, and create a new grafana namespace on the new nerc-ocp-obs cluster with the Grafana Operator for creating new dashboards in development.
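For reference, a ConfigMap for the ACM Observability Grafana instance might look roughly like the sketch below. It follows the pattern in the ACM documentation for custom dashboards, where the `grafana-custom-dashboard: "true"` label marks the ConfigMap for loading; the ConfigMap name, the file key, and the dashboard JSON are hypothetical placeholders, and the JSON is trimmed to a stub rather than a working dashboard.

```yaml
# Sketch of a finished dashboard stored as a ConfigMap (names are hypothetical)
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-custom-dashboard
  namespace: open-cluster-management-observability
  labels:
    # Label the ACM Observability dashboard loader looks for
    grafana-custom-dashboard: "true"
data:
  # The value is the dashboard JSON exported from the development Grafana
  example-custom-dashboard.json: |-
    {
      "title": "Example custom dashboard",
      "uid": "example-custom-dashboard",
      "panels": []
    }
```

The dashboard itself would be built and tested in the development Grafana instance first, then exported as JSON and pasted into the `data` entry.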

schwesig commented 10 months ago

Status: @jbasu01 and @schwesig are working on this in regular 1-on-1 meetings.

schwesig commented 9 months ago

The research and solution finding are done here; see the follow-up in https://github.com/nerc-project/operations/issues/326