Kubernetes and OpenShift Backup Operator
Apache License 2.0
`2.7.0`: `cannot update resource "roles" in API group "rbac.authorization.k8s.io"` #834

Closed 9numbernine9 closed 1 year ago

9numbernine9 commented 1 year ago


Hello! 👋

Since upgrading to 2.7.0 (deployed via Helm chart 4.2.0) the k8up controller generates the following errors whenever it prepares to run a backup:

2023-04-01T16:12:51Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-46g7j","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-46g7j", "reconcileID": "3b8b7c9b-6ede-4389-957b-cd9525db80d4", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up-io\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}

My backup configuration previously worked fine with 2.6.x and Helm chart 4.1.0. I have reinstalled the CRDs as per the release notes but this doesn't seem to make a difference; it seems like something no longer has the correct permissions to execute the backup.

Rolling back to 2.6.0 (Helm 4.1.0) works fine.

I'm using a Schedule that looks like this:

apiVersion: k8up.io/v1
kind: Schedule
  name: example-scheduled-backup
  namespace: example
    repoInitialize: false
      url: https://rest-endpoint-goes-here
    schedule: '@daily-random'

Additional Context

Kubernetes 1.26.3 (K3S 1.26.3+k3s1)


Expected Behavior

Backups in 2.7.0 continue to work. 😄

Steps To Reproduce

Version of K8up


Version of Kubernetes


Distribution of Kubernetes

K3S (1.26.3+k3s1)

Kidswiss commented 1 year ago

Hi @9numbernine9

Thanks for reporting this!

This is odd: the clusterrole that gives the K8up SA its permissions hasn't been touched between Helm chart 4.1.0 and 4.2.0.

After the update, do the k8up-edit, k8up-manager and k8up-view clusterroles still exist?

Can you also post the values.yaml you're using to deploy K8up?

9numbernine9 commented 1 year ago

Hi @Kidswiss!

The k8up-edit, k8up-manager and k8up-view cluster roles are still there. (Just for fun, I uninstalled and reinstalled the Helm chart and verified that they were re-created.) The cluster roles that are created look like this:

Cluster Roles

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: k8up-io-edit
  uid: 1484bf75-f353-4d5d-8a96-ca6d3c050454
  resourceVersion: '64483094'
  creationTimestamp: '2023-04-01T02:45:25Z'
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    rbac.authorization.k8s.io/aggregate-to-admin: 'true'
    rbac.authorization.k8s.io/aggregate-to-edit: 'true'
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  managedFields:
    - manager: terraform-provider-helm_v2.9.0_x5
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      time: '2023-04-01T02:45:25Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:rbac.authorization.k8s.io/aggregate-to-admin: {}
            f:rbac.authorization.k8s.io/aggregate-to-edit: {}
        f:rules: {}
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/k8up-io-edit
rules:
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: k8up-io-manager
  uid: dc06a162-b2e8-4b0c-9698-d1de35b3cdae
  resourceVersion: '66519744'
  creationTimestamp: '2023-04-01T02:45:25Z'
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    helm.sh/chart: k8up-4.2.0
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  managedFields:
    - manager: terraform-provider-helm_v2.9.0_x5
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      time: '2023-04-03T10:13:36Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:helm.sh/chart: {}
        f:rules: {}
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/k8up-io-manager
rules:
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - apps
    resources:
      - deployments
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - batch
    resources:
      - jobs
  - verbs:
      - create
      - get
      - list
      - update
    apiGroups:
      - coordination.k8s.io
    resources:
      - leases
  - verbs:
      - create
      - patch
    apiGroups:
      - ''
    resources:
      - events
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - persistentvolumeclaims
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - persistentvolumes
  - verbs:
      - '*'
    apiGroups:
      - ''
    resources:
      - pods
  - verbs:
      - '*'
    apiGroups:
      - ''
    resources:
      - pods/exec
  - verbs:
      - create
      - delete
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - serviceaccounts
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - archives
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - archives/finalizers
      - archives/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - backups
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - backups/finalizers
      - backups/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - checks
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - checks/finalizers
      - checks/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - effectiveschedules
  - verbs:
      - update
    apiGroups:
      - k8up.io
    resources:
      - effectiveschedules/finalizers
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - prebackuppods
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - prebackuppods/finalizers
      - prebackuppods/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - prunes
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - prunes/finalizers
      - prunes/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - restores
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - restores/finalizers
      - restores/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - schedules
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - schedules/finalizers
      - schedules/status
  - verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
    apiGroups:
      - k8up.io
    resources:
      - snapshots
  - verbs:
      - get
      - patch
      - update
    apiGroups:
      - k8up.io
    resources:
      - snapshots/finalizers
      - snapshots/status
  - verbs:
      - create
      - delete
      - get
      - list
      - watch
    apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - rolebindings
      - roles
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: k8up-io-view
  uid: 5deb8b9b-a380-4ebc-a59f-1cb8c5f3ac84
  resourceVersion: '64483095'
  creationTimestamp: '2023-04-01T02:45:25Z'
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    rbac.authorization.k8s.io/aggregate-to-view: 'true'
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  managedFields:
    - manager: terraform-provider-helm_v2.9.0_x5
      operation: Update
      apiVersion: rbac.authorization.k8s.io/v1
      time: '2023-04-01T02:45:25Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:rbac.authorization.k8s.io/aggregate-to-view: {}
        f:rules: {}
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/k8up-io-view
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - k8up.io
    resources:
      - '*'
```

The values.yaml file I'm providing is pretty basic (I'm deploying this with Terraform FWIW):

        key: RESTIC_PASSWORD
        name: restic
Kidswiss commented 1 year ago

The RBAC rules look okay.

Could you please also check if the serviceaccount on the K8up pod has an actual clusterrolebinding on the k8up-io-manager clusterrole?

9numbernine9 commented 1 year ago

Hmm, I'm not an expert at Kubernetes permissions but here we go. 😄

The pod refers to the k8up-io ServiceAccount:

Pod Description

```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-04-03T12:27:10Z"
  generateName: k8up-io-67f65dd6cf-
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/name: k8up
    pod-template-hash: 67f65dd6cf
  name: k8up-io-67f65dd6cf-v99k4
  namespace: k8up
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: k8up-io-67f65dd6cf
    uid: e02f9271-9f02-4133-905a-9440a178efe1
  resourceVersion: "66601715"
  uid: 7fa0d2cd-8a63-4ece-8c9c-1fe1b03f2be5
spec:
  containers:
  - args:
    - operator
    env:
    - name: BACKUP_IMAGE
      value: ghcr.io/k8up-io/k8up:v2.7.0
    - name: BACKUP_ENABLE_LEADER_ELECTION
      value: "true"
    - name: BACKUP_OPERATOR_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: BACKUP_GLOBALREPOPASSWORD
      valueFrom:
        secretKeyRef:
          key: RESTIC_PASSWORD
          name: restic
    image: ghcr.io/k8up-io/k8up:v2.7.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /metrics
        port: http
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: k8up-operator
    ports:
    - containerPort: 8080
      name: http
      protocol: TCP
    resources:
      limits:
        memory: 256Mi
      requests:
        cpu: 20m
        memory: 128Mi
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-7ppbj
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: k3s01
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: k8up-io
  serviceAccountName: k8up-io
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-7ppbj
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-03T12:27:10Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-04-03T12:27:12Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-04-03T12:27:12Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-04-03T12:27:10Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://1c4dc9bcb777be65d520ac51296df502040ce6fd2f33ababe42677abb7058527
    image: ghcr.io/k8up-io/k8up:v2.7.0
    imageID: ghcr.io/k8up-io/k8up@sha256:e73dc644d9af02d0093905bff8591cd9039a0ad5058c534c421deb3a466f6610
    lastState: {}
    name: k8up-operator
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-03T12:27:12Z"
  hostIP:
  phase: Running
  podIP:
  podIPs:
  - ip:
  - ip: fd00::bb
  qosClass: Burstable
  startTime: "2023-04-03T12:27:10Z"
```

The k8up-io ServiceAccount is defined as:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  creationTimestamp: "2023-04-03T10:16:51Z"
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    helm.sh/chart: k8up-4.2.0
  name: k8up-io
  namespace: k8up
  resourceVersion: "66601651"
  uid: 48d18c16-edfe-47cb-9f4f-6766675e3a80

The ClusterRoleBinding is defined as:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  creationTimestamp: "2023-04-03T10:16:51Z"
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    helm.sh/chart: k8up-4.2.0
  name: k8up-io
  resourceVersion: "66601661"
  uid: 7311f521-fdf0-4461-bdae-27410e41f5b3
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: k8up-io-manager
subjects:
- kind: ServiceAccount
  name: k8up-io
  namespace: k8up

And the k8up-io-manager ClusterRole:

k8up-io-manager

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    meta.helm.sh/release-name: k8up-io
    meta.helm.sh/release-namespace: k8up
  creationTimestamp: "2023-04-03T10:16:51Z"
  labels:
    app.kubernetes.io/instance: k8up-io
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: k8up
    helm.sh/chart: k8up-4.2.0
  name: k8up-io-manager
  resourceVersion: "66601654"
  uid: 61f2aab4-14a3-47c7-b2e3-dda26e328224
rules:
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - create
  - get
  - list
  - update
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods/exec
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - k8up.io
  resources:
  - archives
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - archives/finalizers
  - archives/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - backups
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - backups/finalizers
  - backups/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - checks
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - checks/finalizers
  - checks/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - effectiveschedules
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - effectiveschedules/finalizers
  verbs:
  - update
- apiGroups:
  - k8up.io
  resources:
  - prebackuppods
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - prebackuppods/finalizers
  - prebackuppods/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - prunes
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - prunes/finalizers
  - prunes/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - restores
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - restores/finalizers
  - restores/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - schedules
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - schedules/finalizers
  - schedules/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - k8up.io
  resources:
  - snapshots
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - k8up.io
  resources:
  - snapshots/finalizers
  - snapshots/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - rolebindings
  - roles
  verbs:
  - create
  - delete
  - get
  - list
  - watch
```

Kidswiss commented 1 year ago

The RBAC looks good.

Unfortunately, I was not able to reproduce the issue locally, which makes it hard to figure out what's wrong...

Some things you could also check:

9numbernine9 commented 1 year ago

Hi @Kidswiss - apologies for not writing back sooner.

To answer your questions:

I've been trying to find some time to replicate my issue from a clean slate, and I didn't have time until this weekend. With that said: I've tried again a few times from a clean slate each time, and the steps below are how I've been able to replicate it.

❯ kubectl create ns k8up
namespace/k8up created
❯ cat restic.yaml
apiVersion: v1
data:
  RESTIC_PASSWORD: <base64-password goes here>
kind: Secret
metadata:
  name: restic
  namespace: k8up
type: Opaque

❯ kubectl apply -f restic.yaml
secret/restic created
❯ kubectl apply -f https://github.com/k8up-io/k8up/releases/download/k8up-4.2.0/k8up-crd.yaml
customresourcedefinition.apiextensions.k8s.io/archives.k8up.io created
customresourcedefinition.apiextensions.k8s.io/backups.k8up.io created
customresourcedefinition.apiextensions.k8s.io/checks.k8up.io created
customresourcedefinition.apiextensions.k8s.io/prebackuppods.k8up.io created
customresourcedefinition.apiextensions.k8s.io/prunes.k8up.io created
customresourcedefinition.apiextensions.k8s.io/restores.k8up.io created
customresourcedefinition.apiextensions.k8s.io/schedules.k8up.io created
customresourcedefinition.apiextensions.k8s.io/snapshots.k8up.io created
❯ helm repo add k8up-io https://k8up-io.github.io/k8up
"k8up-io" has been added to your repositories

❯ cat values.yaml
        key: RESTIC_PASSWORD
        name: restic

❯ helm install k8up k8up-io/k8up --namespace k8up --values values.yaml
NAME: k8up
LAST DEPLOYED: Thu Apr  6 06:27:27 2023
STATUS: deployed
! Attention !

This Helm chart does not include CRDs.
Please make sure you have installed or upgraded the necessary CRDs as instructed in the Chart README.

❯ kubectl get pods -n k8up
NAME                    READY   STATUS    RESTARTS   AGE
k8up-7dc796f59d-n8pbx   1/1     Running   0          22h

❯ kubectl describe pod -n k8up k8up-7dc796f59d-n8pbx
Name:             k8up-7dc796f59d-n8pbx
Namespace:        k8up
Priority:         0
Service Account:  k8up
Node:             k3s01/
Start Time:       Sun, 09 Apr 2023 11:29:52 -0400
Labels:           app.kubernetes.io/instance=k8up
Annotations:      <none>
Status:           Running
  IP:           fd00::16
Controlled By:  ReplicaSet/k8up-7dc796f59d
    Container ID:  containerd://a1f3067cd3ee260f2bcf35d544f75900efc863cf144f40afe79fb1630a521ac3
    Image:         ghcr.io/k8up-io/k8up:v2.7.0
    Image ID:      ghcr.io/k8up-io/k8up@sha256:e73dc644d9af02d0093905bff8591cd9039a0ad5058c534c421deb3a466f6610
    Port:          8080/TCP
    Host Port:     0/TCP
    State:          Running
      Started:      Sun, 09 Apr 2023 11:29:54 -0400
    Ready:          True
    Restart Count:  0
      memory:  256Mi
      cpu:     20m
      memory:  128Mi
    Liveness:  http-get http://:http/metrics delay=30s timeout=1s period=10s #success=1 #failure=3
      BACKUP_IMAGE:                   ghcr.io/k8up-io/k8up:v2.7.0
      BACKUP_OPERATOR_NAMESPACE:      k8up (v1:metadata.namespace)
      BACKUP_GLOBALREPOPASSWORD:      <set to the key 'RESTIC_PASSWORD' in secret 'restic'>  Optional: false
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tn99b (ro)
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
❯ cat schedule.yaml
apiVersion: k8up.io/v1
kind: Schedule
  name: example-scheduled-backup
  namespace: example
      url: https://restic-url-goes-here.com
    schedule: '*/5 0 * * *'

❯ k apply -f schedule.yaml
schedule.k8up.io/example-scheduled-backup created
2023-04-09T15:29:54Z    INFO    k8up    Starting k8up…  {"version": "2.7.0", "date": "2023-03-30T12:16:39Z", "commit": "8f203d75eaa6826405e0c288d2d0915fa6c53e79", "go_os": "linux", "go_arch": "amd64", "go_version": "go1.19.7", "uid": 65532, "gid": 0}
2023-04-09T15:29:54Z    INFO    k8up.operator   initializing
2023-04-09T15:29:54Z    INFO    k8up.operator.controller-runtime.metrics    Metrics server is starting to listen    {"addr": ":8080"}
2023-04-09T15:29:54Z    INFO    k8up.operator   Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
I0409 15:29:54.783793       1 leaderelection.go:248] attempting to acquire leader lease k8up/d2ab61da.syn.tools...
I0409 15:30:10.868721       1 leaderelection.go:258] successfully acquired lease k8up/d2ab61da.syn.tools
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "archive.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Archive", "source": "kind source: *v1.Archive"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "archive.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Archive", "source": "kind source: *v1.Job"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "archive.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Archive"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "restore.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Restore", "source": "kind source: *v1.Restore"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "prune.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Prune", "source": "kind source: *v1.Prune"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "restore.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Restore", "source": "kind source: *v1.Job"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "prune.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Prune", "source": "kind source: *v1.Job"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "prune.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Prune"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "restore.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Restore"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "source": "kind source: *v1.Backup"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "check.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Check", "source": "kind source: *v1.Check"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "check.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Check", "source": "kind source: *v1.Job"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "check.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Check"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "source": "kind source: *v1.Job"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting EventSource    {"controller": "schedule.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Schedule", "source": "kind source: *v1.Schedule"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting Controller {"controller": "schedule.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Schedule"}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "prune.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Prune", "worker count": 1}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "schedule.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Schedule", "worker count": 1}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "check.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Check", "worker count": 1}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "archive.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Archive", "worker count": 1}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "restore.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Restore", "worker count": 1}
2023-04-09T15:30:10Z    INFO    k8up.operator   Starting workers    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "worker count": 1}
2023-04-10T00:00:00Z    INFO    k8up.operator.scheduler Running schedule    {"cron": "*/5 0 * * *", "key": "example/example-scheduled-backup/backup"}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "2c412819-7c89-4c96-8560-7353ef8470e7", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "a68fffbe-5842-4376-aafc-a8d331901db2", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "b0a1495d-5feb-4f3d-b30d-e8ee19edba78", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "17c7bb88-626c-4993-8edd-bd17f00af2e3", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "115903d7-fe33-4def-985a-46025b573978", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "69231f9b-3f84-4925-a421-d6ef0e5d9e37", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "1a98d22b-56ba-49c6-bcc0-809f8a09dfbb", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}
2023-04-10T00:00:00Z    ERROR   k8up.operator   Reconciler error    {"controller": "backup.k8up.io", "controllerGroup": "k8up.io", "controllerKind": "Backup", "Backup": {"name":"example-scheduled-backup-backup-vn47q","namespace":"example"}, "namespace": "example", "name": "example-scheduled-backup-backup-vn47q", "reconcileID": "6e9f9f1e-6206-4c33-864e-f4420a39d20c", "error": "roles.rbac.authorization.k8s.io \"pod-executor\" is forbidden: User \"system:serviceaccount:k8up:k8up\" cannot update resource \"roles\" in API group \"rbac.authorization.k8s.io\" in the namespace \"example\""}

I hope that's helpful! As I mentioned in my original comment, I'm running Kubernetes 1.26.3 via K3S (v1.26.3+k3s1). Normally I manage the configuration of my cluster using Terraform, but for the purposes of debugging I've installed and configured K8up manually to make sure that I'm not introducing some weird configuration via Terraform.

Kidswiss commented 1 year ago

This is where things are a bit odd, because even though I've created a schedule that should run every five minutes

The schedule you created will only run every 5 minutes during hour 0 (00:00 to 00:55); running it every 5 minutes around the clock would be */5 * * * * :)
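To make the difference between the two expressions concrete, here is a minimal Python sketch that expands the minute and hour fields of a five-field cron expression. It is an illustration only: `expand_field`, `fire_times` and `runs_per_day` are hypothetical helper names, only a small subset of cron syntax is handled, and K8up-specific values such as `@daily-random` are not covered.

```python
def expand_field(field, lo, hi):
    """Expand one cron field into the set of matching integers.

    Supported subset (an assumption of this sketch): '*', '*/n', 'a',
    'a-b', 'a-b/n' and comma-separated lists of those.
    """
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            a, b = part.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def fire_times(expr):
    """Return the sorted (hour, minute) pairs at which the expression fires.

    Only the minute and hour fields are evaluated; day and month fields
    are ignored, which is enough for the two schedules compared here.
    """
    minute, hour = expr.split()[:2]
    minutes = expand_field(minute, 0, 59)
    hours = expand_field(hour, 0, 23)
    return sorted((h, m) for h in hours for m in minutes)


def runs_per_day(expr):
    """Count how often the expression fires in one day."""
    return len(fire_times(expr))


if __name__ == "__main__":
    for expr in ("*/5 0 * * *", "*/5 * * * *"):
        times = fire_times(expr)
        first = "%02d:%02d" % times[0]
        last = "%02d:%02d" % times[-1]
        print(expr, "->", runs_per_day(expr), "runs, from", first, "to", last)
```

With the schedule from this issue the runs are confined to the first hour of the day, which matches the `Running schedule` log entry at `00:00:00Z` above.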

❯ kubectl describe pod -n k8up k8up-7dc796f59d-n8pbx
Name:             k8up-7dc796f59d-n8pbx
Namespace:        k8up
Priority:         0
Service Account:  k8up

Here the service account seems to be named k8up, but the rolebinding is for an account with the name k8up-io. Can you manually edit the k8up-io clusterrolebinding and change the referenced user to k8up, then restart the pod?

The name of the service account gets taken from the release name by default. So helm install k8up k8up-io/k8up --namespace k8up --values values.yaml should result in consistent naming of the service account and the bindings with the name k8up.
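The comparison described above (does the binding's subject match the service account the pod actually runs as?) can be sketched in a few lines of Python. This is only an illustration: `binding_covers` is a hypothetical helper, and the dictionary mirrors the ClusterRoleBinding posted earlier in this thread rather than anything fetched from a live cluster.

```python
def binding_covers(binding, sa_name, sa_namespace):
    """Return True if the binding lists the given ServiceAccount as a subject."""
    return any(
        subject.get("kind") == "ServiceAccount"
        and subject.get("name") == sa_name
        and subject.get("namespace") == sa_namespace
        for subject in binding.get("subjects", [])
    )


# roleRef and subjects as posted for the Helm release named "k8up-io".
K8UP_IO_BINDING = {
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "k8up-io-manager",
    },
    "subjects": [
        {"kind": "ServiceAccount", "name": "k8up-io", "namespace": "k8up"},
    ],
}

if __name__ == "__main__":
    # Pod from the first install runs as k8up/k8up-io.
    print(binding_covers(K8UP_IO_BINDING, "k8up-io", "k8up"))
    # Pod from the manual reinstall runs as k8up/k8up.
    print(binding_covers(K8UP_IO_BINDING, "k8up", "k8up"))
```

A `False` result only signals a problem when the binding and the pod belong to the same Helm release; the two names compared here come from installs with different release names.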

9numbernine9 commented 1 year ago

This is where things are a bit odd, because even though I've created a schedule that should run every five minutes

The schedule you created will only run every 5 minutes during hour 0 (00:00 to 00:55); running it every 5 minutes around the clock would be */5 * * * * :)

Yup, I'm an idiot. I didn't even notice the 0. 😂

Here the service account seems to be named k8up, but the rolebinding is for an account with the name k8up-io. Can you manually edit the k8up-io clusterrolebinding and change the referenced user to k8up, then restart the pod?

I think the referenced account for the ClusterRoleBinding already is k8up?

❯ k describe clusterrolebinding k8up
Name:         k8up
Labels:       app.kubernetes.io/instance=k8up
Annotations:  meta.helm.sh/release-name: k8up
              meta.helm.sh/release-namespace: k8up
  Kind:  ClusterRole
  Name:  k8up-manager
  Kind            Name  Namespace
  ----            ----  ---------
  ServiceAccount  k8up  k8up
Kidswiss commented 1 year ago

Okay I'm starting to run out of ideas here :(

I've just tested k8up in a local k3d cluster with the exact same version you have, and I didn't run into any issues starting backups.

I've used this version:

k3d cluster create -i docker.io/rancher/k3s:v1.26.3-k3s1

For the installation I copy and pasted your exact commands from your comment. From what I've seen in your comments the RBAC and everything seems to be correct.

9numbernine9 commented 1 year ago

@Kidswiss I've done some more testing/debugging and here's what I've found so far:

This makes the error go away and the scheduled backups now complete successfully. I humbly admit that I don't know enough about K8S RBAC to know if this change is a good idea or not.

Kidswiss commented 1 year ago

Huh, interesting. Thanks for investigating!

I've just skimmed through the diff between 2.6 and 2.7. We're still doing the role logic the same way. The only thing that changed is that we added a new permission. That's where your issue comes from, and also why I was never able to reproduce it!

For K8up to be able to add the new permissions to an existing role, it actually needs this update permission. The bug never triggered for me as I've always tested against a fresh cluster.

So yes, the update is actually necessary, if people migrate from an older version. I'm opening a quick PR for that.
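A small Python model may help show why the error only appears on upgraded clusters. It is a simplified sketch, not K8up's code: `is_allowed` ignores `resourceNames` and other parts of real RBAC evaluation, and `ensure_role` assumes a create-or-update flow (create the role if it is missing, otherwise update it), which is inferred from the explanation above rather than taken from the operator source.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"

# Rule copied from the k8up-io-manager ClusterRole posted earlier in this thread.
MANAGER_RBAC_RULE = {
    "apiGroups": [RBAC_GROUP],
    "resources": ["rolebindings", "roles"],
    "verbs": ["create", "delete", "get", "list", "watch"],
}


def _matches(allowed, wanted):
    """A rule entry matches if it is the wildcard or lists the wanted value."""
    return "*" in allowed or wanted in allowed


def is_allowed(rules, api_group, resource, verb):
    """Simplified RBAC check: any single rule must match group, resource and verb."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


def ensure_role(rules, existing_roles, name):
    """Assumed create-or-update flow: report the verb needed and whether it is permitted."""
    verb = "update" if name in existing_roles else "create"
    return verb, is_allowed(rules, RBAC_GROUP, "roles", verb)


if __name__ == "__main__":
    # Fresh namespace: the role does not exist yet, so "create" is enough.
    print(ensure_role([MANAGER_RBAC_RULE], set(), "pod-executor"))
    # Upgraded namespace: the role was created by 2.6.x, so "update" is needed.
    print(ensure_role([MANAGER_RBAC_RULE], {"pod-executor"}, "pod-executor"))
    # Same rule with "update" added, as discussed above.
    patched = dict(MANAGER_RBAC_RULE, verbs=MANAGER_RBAC_RULE["verbs"] + ["update"])
    print(ensure_role([patched], {"pod-executor"}, "pod-executor"))
```

In this model the second call is the only one that is refused, which lines up with the `cannot update resource "roles"` error in the logs and with the bug not triggering on a fresh cluster.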

SchoolGuy commented 1 year ago

This hit me too today. @Kidswiss is an "emergency" release of the Helm chart to 4.2.1 possible? In the meantime I will downgrade the operator to 2.6.0.

Kidswiss commented 1 year ago

Hi @SchoolGuy

I'm currently waiting for some more reviews in https://github.com/k8up-io/k8up/pull/852 to make the whole RBAC system a bit more secure as well. I hope to get it done soon.