grafana / grafana-operator

An operator for Grafana that installs and manages Grafana instances, Dashboards and Datasources through Kubernetes/OpenShift CRs
https://grafana.github.io/grafana-operator/
Apache License 2.0

[Bug] Operator continues to sync even when resyncPeriod is set to 0m #1682

Closed Brandon-Kimberly closed 1 month ago

Brandon-Kimberly commented 1 month ago

Describe the bug

I installed the Grafana Operator Helm chart in my Kubernetes cluster with resyncPeriod set to 0m. According to the documentation, this should disable the syncing functionality entirely:

If you never want the operator to poll for changes in the dashboards you need to set this value to 0m

Version

v5.6.0

To Reproduce

Steps to reproduce the behavior:

Install Grafana Operator helm chart v5.6.0 and supply the value resyncPeriod: 0m.

Expected behavior

I expect the dashboards to be created once and never synced again. However, when I list the grafanadashboards in my Kubernetes cluster, I can see that they continue to sync. Note that Last Resync is far more recent than the age, and that resyncing has only stopped because the API key is no longer active:

kubectl get grafanadashboard -n grafana-operator
NAME                 NO MATCHING INSTANCES   LAST RESYNC   AGE
a-grafanadashboard                           33h           4d21h
b-grafanadashboard                           33h           4d21h
c-grafanadashboard                           33h           4d21h
d-grafanadashboard                           33h           4d21h
e-grafanadashboard                           33h           4d21h
f-grafanadashboard                           33h           4d21h
g-grafanadashboard                           33h           4d21h

Suspect component/Location where the bug might be occurring

Unknown

Runtime (please complete the following information):

Additional context

Here is the full output for the Helm release (you can see resyncPeriod: 0m in both USER-SUPPLIED VALUES and COMPUTED VALUES):

helm -n grafana-operator get all grafana-operator
NAME: grafana-operator
LAST DEPLOYED: Fri Sep 13 18:29:32 2024
NAMESPACE: grafana-operator
STATUS: deployed
REVISION: 1
TEST SUITE: None
USER-SUPPLIED VALUES:
resyncPeriod: 0m

COMPUTED VALUES:
additionalLabels: {}
affinity: {}
env: []
fullnameOverride: ""
image:
  pullPolicy: IfNotPresent
  repository: ghcr.io/grafana/grafana-operator
  tag: ""
imagePullSecrets: []
leaderElect: false
metricsService:
  metricsPort: 9090
  type: ClusterIP
nameOverride: ""
namespaceScope: false
nodeSelector: {}
podAnnotations: {}
podSecurityContext: {}
priorityClassName: ""
resources: {}
resyncPeriod: 0m
securityContext:
  capabilities:
    drop:
    - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
serviceAccount:
  annotations: {}
  create: true
  name: ""
tolerations: []
watchNamespaces: ""

HOOKS:
MANIFEST:
---
# Source: grafana-operator/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grafana-operator
  namespace: "grafana-operator"
  labels:
    helm.sh/chart: grafana-operator-v5.6.0
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
    app.kubernetes.io/version: "v5.6.0"
    app.kubernetes.io/managed-by: Helm
automountServiceAccountToken: true
---
# Source: grafana-operator/templates/rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-operator-permissions
  labels:
    helm.sh/chart: grafana-operator-v5.6.0
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
    app.kubernetes.io/version: "v5.6.0"
    app.kubernetes.io/managed-by: Helm
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - ""
    resources:
      - configmaps
      - persistentvolumeclaims
      - secrets
      - serviceaccounts
      - services
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - get
      - list
      - patch
      - watch
  - apiGroups:
      - apps
    resources:
      - deployments
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadashboards
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadashboards/finalizers
    verbs:
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadashboards/status
    verbs:
      - get
      - patch
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadatasources
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadatasources/finalizers
    verbs:
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanadatasources/status
    verbs:
      - get
      - patch
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanafolders
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanafolders/finalizers
    verbs:
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanafolders/status
    verbs:
      - get
      - patch
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanas
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanas/finalizers
    verbs:
      - update
  - apiGroups:
      - grafana.integreatly.org
    resources:
      - grafanas/status
    verbs:
      - get
      - patch
      - update
  - apiGroups:
      - networking.k8s.io
    resources:
      - ingresses
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - route.openshift.io
    resources:
      - routes
      - routes/custom-host
    verbs:
      - create
      - delete
      - get
      - list
      - update
      - watch
  - apiGroups:
      - authentication.k8s.io
    resources:
      - tokenreviews
    verbs:
      - create
  - apiGroups:
      - authorization.k8s.io
    resources:
      - subjectaccessreviews
    verbs:
      - create
---
# Source: grafana-operator/templates/rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-operator-permissions
  labels:
    helm.sh/chart: grafana-operator-v5.6.0
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
    app.kubernetes.io/version: "v5.6.0"
    app.kubernetes.io/managed-by: Helm
subjects:
  - kind: ServiceAccount
    name: grafana-operator
    namespace: grafana-operator
roleRef:
  kind: ClusterRole
  name: grafana-operator-permissions
  apiGroup: rbac.authorization.k8s.io
---
# Source: grafana-operator/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana-operator-metrics-service
  namespace: "grafana-operator"
  labels:
    helm.sh/chart: grafana-operator-v5.6.0
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
    app.kubernetes.io/version: "v5.6.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 9090
      targetPort: metrics
      protocol: TCP
      name: metrics
  selector:
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
---
# Source: grafana-operator/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-operator
  namespace: "grafana-operator"
  labels:
    helm.sh/chart: grafana-operator-v5.6.0
    app.kubernetes.io/name: grafana-operator
    app.kubernetes.io/instance: grafana-operator
    app.kubernetes.io/version: "v5.6.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: grafana-operator
      app.kubernetes.io/instance: grafana-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/name: grafana-operator
        app.kubernetes.io/instance: grafana-operator
    spec:
      serviceAccountName: grafana-operator
      containers:
        - name: grafana-operator
          securityContext:
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
          image: "ghcr.io/grafana/grafana-operator:v5.6.0"
          imagePullPolicy: IfNotPresent
          env:
            - name: WATCH_NAMESPACE
              value: 
          args:
            - --health-probe-bind-address=:8081
            - --metrics-bind-address=0.0.0.0:9090
          volumeMounts:
            - name: dashboards-dir
              mountPath: /tmp/dashboards
          ports:
            - containerPort: 9090
              name: metrics
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
      volumes:
        - name: dashboards-dir
          emptyDir: {}

Brandon-Kimberly commented 1 month ago

Can someone please help me understand this? The documentation suggests that this is all I should need to do to disable the syncing mechanism but it does not appear to be working as advertised.

theSuess commented 1 month ago

The resync period needs to be set on the resource (GrafanaDashboard, GrafanaFolder, GrafanaAlertRuleGroup, etc.) and not in the Helm values.

Setting the resync interval to 0 will prevent the operator from periodically resyncing on its own, but it will still sync the resource in two cases: when the resource itself changes, and when the operator restarts.

Can you further elaborate on your use case of "initial sync only" dashboards?
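
As a rough illustration of the first point above, a GrafanaDashboard with the resync period set on the resource itself, rather than in the Helm values, might look something like the sketch below. The `v1beta1` API version, the resource name, the `instanceSelector` labels, and the dashboard JSON are assumptions for illustration and are not taken from this thread; check the CRD installed in your cluster for the exact fields.

```yaml
# Sketch only: the name, labels, and dashboard JSON are hypothetical.
apiVersion: grafana.integreatly.org/v1beta1  # assumed API version for operator v5
kind: GrafanaDashboard
metadata:
  name: example-dashboard        # hypothetical name
  namespace: grafana-operator
spec:
  # Set on the resource itself, not in the Helm values.
  # 0m disables periodic resyncs; per the discussion in this thread, the
  # operator still syncs when the resource changes or the operator restarts.
  resyncPeriod: 0m
  instanceSelector:
    matchLabels:
      dashboards: grafana        # hypothetical label on the Grafana resource
  json: |
    {
      "title": "Example dashboard",
      "panels": []
    }
```

With a resource like this, the LAST RESYNC column in `kubectl get grafanadashboard` should advance only after the resource is edited or the operator restarts, not on a timer.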

Brandon-Kimberly commented 1 month ago

Thanks for the response! I would like for the Grafana Operator to make a single call to the Grafana API to create the dashboard. After the dashboard has been initially created, I don't want the operator to continue to sync. How can I achieve this?

Brandon-Kimberly commented 1 month ago

I am using Flux to sync the dashboard definitions, in case that could account for the resyncing: the resource may technically be "updated" from time to time.

theSuess commented 1 month ago

We discussed this in our weekly sync call and don't think this is a behavior we want to encourage.

The intended use case of setting resyncPeriod: 0m is to still apply changes when the resource changes. For an operator, there is no way to detect if the reconciliation happened due to a restart or resource change.

We'll update the documentation to communicate this more clearly.

For your use case, you could take a look at Grizzly or the Grafana Terraform provider to accomplish the pattern of initial sync only.