fluent / fluent-operator

Operate Fluent Bit and Fluentd in the Kubernetes way - Previously known as FluentBit Operator
Apache License 2.0
588 stars 250 forks source link

help request: Why are cluster-wide permission required for some resources such as DaemonSets, Secrets & more? #1281

Open boatski opened 3 months ago

boatski commented 3 months ago

Describe the issue

I am looking to rollout Fluent Operator in an enterprise Kubernetes cluster to handle various telemetry needs for 100+ services. The configuration related CRDs look to allow our service owners to extend our telemetry pipeline for their own needs, such as additional outputs. That's the main reason Fluent Operator stood out for me.

Security is a concern with full cluster-wide permissions given on DaemonSets, StatefulSets, Secrets, ServiceAccounts, and more. Given that these are namespace scoped what functionality does the operator provide that requires cluster-wide permissions on these resources?

I've attempted to reduce the permission scope on these resources to a single namespace using Roles/RoleBindings, but the operator attempts to list/watch these resources across the entire cluster.

For example, I moved permissions below from the ClusterRole to a namespaced Role (with the appropriate bindings). If the operator exists in a single namespace, as well as either FluentBit or Fluentd, then I would not expect that it needs to monitor the entire cluster for these resources.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: telemetry
  name: fluent-operator
rules:
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - statefulsets
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - clusterrolebindings
    verbs:
      - create
      - list
      - get
      - watch
      - patch
  - apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - clusterroles
    verbs:
      - create
      - list
      - get
      - watch
      - patch
  - apiGroups:
      - ""
    resources:
      - secrets
      - configmaps
      - services
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch

The logs below are from the operator once those permissions were scoped to a single namespace.

fluent-operator W0802 20:15:54.123352       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1.DaemonSet: daemonsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "daemonsets" in API group "apps" at the cluster scope
fluent-operator E0802 20:15:54.123404       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1.DaemonSet: failed to list *v1.DaemonSet: daemonsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "daemonsets" in API group "apps" at the cluster scope
fluent-operator W0802 20:16:07.306561       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "services" in API group "" at the cluster scope
fluent-operator E0802 20:16:07.306618       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "services" in API group "" at the cluster scope
fluent-operator W0802 20:16:10.656780       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "secrets" in API group "" at the cluster scope
fluent-operator E0802 20:16:10.656875       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "secrets" in API group "" at the cluster scope
fluent-operator W0802 20:16:19.079865       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "statefulsets" in API group "apps" at the cluster scope
fluent-operator E0802 20:16:19.079921       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "statefulsets" in API group "apps" at the cluster scope
fluent-operator W0802 20:16:23.153158       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1.ServiceAccount: serviceaccounts is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "serviceaccounts" in API group "" at the cluster scope
fluent-operator E0802 20:16:23.153227       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1.ServiceAccount: failed to list *v1.ServiceAccount: serviceaccounts is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "serviceaccounts" in API group "" at the cluster scope

Is it possible to configure RBAC for Fluent Operator such that it doesn't need cluster-wide permissions on these resources while continuing to allow telemetry to be collected from all namespaces? If not, what functionality does the operator provide that requires such permissions?

How did you install fluent operator?

helm upgrade --install fluent-operator --create-namespace -n telemetry <chart_path>

Additional context

No response

wenchajun commented 3 months ago

That's a good idea. We first set it to the cluster level because essentially fluentbit is going to collect logs from all namespaces, tweak it for custom namespaces?

boatski commented 3 months ago

The operator shouldn't need full cluster scope access to daemonsets/statefulsets to collect logs from all namespaces, unless there's something I'm misunderstanding? It would only need those permissions in the namespace fluentbit/fluentd are deployed to. I can see that the operator does need cluster scope access for some plugins to collect from all namespaces so that it can grant those permissions to fluentbit/fluentd, such as the k8s plugin to get pod info.

Are there any use cases I'm maybe not considering that would require cluster scope permission on daemonsets/statefulsets from the operator?

sherwoodzern commented 1 month ago

Is there a work-around or solution for this issue? I'm encountering the same issue.