stackabletech / documentation

Stackable's central documentation repository built on Antora
https://docs.stackable.tech
Apache License 2.0

Document required RBAC permissions and why they are needed #379

Open fhennig opened 1 year ago

soenkeliebau commented 5 months ago

Spent some time investigating this today. I used the Trino operator and the secret operator as test cases: the secret operator is probably the most complex one, and Trino is representative of a "normal" operator.

Testing was done in a kind cluster on my machine with impersonation:

helm --kube-as-user=foo install secret-operator stackable/secret-operator --version 23.11.0

The user foo initially had no permissions at all. I added permissions as errors came up during the installation process and arrived at these roles and role bindings:

# Cluster scoped resources
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deploy-stackable-cluster
rules:
  - apiGroups:
      - "apiextensions.k8s.io"
    resources:
      - customresourcedefinitions
    verbs:
      - create
      - get
      - list
  - apiGroups:
      - "rbac.authorization.k8s.io"
    resources:
      - clusterroles
      - clusterrolebindings
    verbs:
      - get
      - list
      - create
  - apiGroups:
      - "storage.k8s.io"
    resources:
      - storageclasses
      - csidrivers
    verbs:
      - get
      - list
      - create
  - apiGroups:
      - "secrets.stackable.tech"
    resources:
      - secretclasses
    verbs:
      - get
      - list
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: deploy-stackable-cluster
subjects:
  - kind: User
    name: foo
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: deploy-stackable-cluster
  apiGroup: rbac.authorization.k8s.io

# Namespaced Resources
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deploy-stackable-namespaced
rules:
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
      - configmaps
      - secrets
    verbs:
      - get
      - list
      - create
  - apiGroups:
      - "apps"
    resources:
      - deployments
    verbs:
      - get
      - list
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deploy-stackable-namespaced
subjects:
  - kind: User
    name: foo
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: deploy-stackable-namespaced
  apiGroup: rbac.authorization.k8s.io

This seems to allow installing the operator itself. However, the Helm chart also deploys ServiceAccounts for the operators themselves, and granting them their roles requires additional privileges, since users can only grant permissions they hold themselves:

Error: INSTALLATION FAILED: 3 errors occurred:
    * clusterroles.rbac.authorization.k8s.io "trino-clusterrole" is forbidden: user "foo" (groups=["system:authenticated"]) is attempting to grant RBAC permissions not currently held:
{APIGroups:[""], Resources:["configmaps"], Verbs:["get"]}
{APIGroups:[""], Resources:["secrets"], Verbs:["get"]}
{APIGroups:[""], Resources:["serviceaccounts"], Verbs:["get"]}
{APIGroups:["events.k8s.io"], Resources:["events"], Verbs:["create"]}
    * clusterroles.rbac.authorization.k8s.io "trino-operator-clusterrole" is forbidden: user "foo" (groups=["system:authenticated"]) is attempting to grant RBAC permissions not currently held:
{APIGroups:[""], Resources:["configmaps"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:[""], Resources:["endpoints"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:[""], Resources:["nodes"], Verbs:["list" "watch"]}
{APIGroups:[""], Resources:["pods"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:[""], Resources:["secrets"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:[""], Resources:["serviceaccounts"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:[""], Resources:["services"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:["apps"], Resources:["statefulsets"], Verbs:["get" "create" "delete" "list" "patch" "update" "watch"]}
{APIGroups:["authentication.stackable.tech"], Resources:["authenticationclasses"], Verbs:["get" "list" "watch"]}
{APIGroups:["batch"], Resources:["jobs"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:["events.k8s.io"], Resources:["events"], Verbs:["create"]}
{APIGroups:["opa.stackable.tech"], Resources:["regorules"], Verbs:["create" "get" "list" "watch" "patch"]}
{APIGroups:["policy"], Resources:["poddisruptionbudgets"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:["rbac.authorization.k8s.io"], Resources:["clusterroles"], ResourceNames:["trino-clusterrole"], Verbs:["bind"]}
{APIGroups:["rbac.authorization.k8s.io"], Resources:["rolebindings"], Verbs:["create" "delete" "get" "list" "patch" "update" "watch"]}
{APIGroups:["s3.stackable.tech"], Resources:["s3connections"], Verbs:["get" "list" "watch"]}
{APIGroups:["trino.stackable.tech"], Resources:["trinocatalogs"], Verbs:["get" "list" "watch"]}
{APIGroups:["trino.stackable.tech"], Resources:["trinoclusters"], Verbs:["get" "list" "patch" "watch"]}
{APIGroups:["trino.stackable.tech"], Resources:["trinoclusters/status"], Verbs:["patch"]}
    * clusterroles.rbac.authorization.k8s.io "trino-operator-clusterrole" not found

I have started adding these to a separate ClusterRole object to keep them apart from the permissions above:

# Permissions needed only because the chart grants them to the operator roles during installation
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deploy-stackable-grant
rules:
  - apiGroups:
      - "events.k8s.io"
    resources:
      - events
    verbs:
      - create
  - apiGroups:
      - ""
    resources:
      - configmaps
      - endpoints
      - pods
      - secrets
      - serviceaccounts
      - services
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - "apps"
    resources:
      - deployments
      - statefulsets
      - daemonsets
    verbs:
      - get
      - create
      - delete
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - "rbac.authorization.k8s.io"
    resources:
      - clusterroles
    verbs:
      - bind
  - apiGroups:
      - "rbac.authorization.k8s.io"
    resources:
      - rolebindings
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
# Binds the deploy-stackable-grant ClusterRole to user foo. As a RoleBinding, it grants these rules only within its own namespace.
kind: RoleBinding
metadata:
  name: deploy-stackable-grant
subjects:
  - kind: User
    name: foo
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: deploy-stackable-grant
  apiGroup: rbac.authorization.k8s.io

Not sure if this works yet, as I still get the full error message shown above. Maybe Kubernetes reports all requested privileges as soon as one is missing. I ran out of time at this point.
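One way to narrow this down offline is to compare the rules the chart tries to grant with the rules the installing user holds. The following is a minimal sketch, not a reimplementation of the Kubernetes escalation check: the function names `expand` and `missing` are made up for illustration, the sample data is a hypothetical excerpt loosely based on the roles and error message above, and wildcards and `resourceNames` are ignored. It also ignores scope, which may matter here: as far as I know, Kubernetes requires the permissions to be held at the same scope as the role being created, so cluster-wide for a ClusterRole.

```python
from itertools import product


def expand(rules):
    """Flatten RBAC rules into a set of (apiGroup, resource, verb) triples."""
    triples = set()
    for rule in rules:
        triples.update(product(rule["apiGroups"], rule["resources"], rule["verbs"]))
    return triples


def missing(requested, held):
    """Return the requested triples that the held rules do not cover."""
    return sorted(expand(requested) - expand(held))


# Hypothetical excerpt of the rules held by the installing user.
held = [
    {"apiGroups": [""], "resources": ["configmaps", "pods"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["events.k8s.io"], "resources": ["events"], "verbs": ["create"]},
]
# Hypothetical excerpt of the rules the chart tries to grant.
requested = [
    {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get"]},
    {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": ["get", "list"]},
    {"apiGroups": ["events.k8s.io"], "resources": ["events"], "verbs": ["create"]},
]

for group, resource, verb in missing(requested, held):
    print(f"missing: {verb} {resource} in apiGroup {group!r}")
```

If this reports nothing missing while the install still fails with the full list, the scope of the binding is a more likely culprit than an incomplete rule set.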

soenkeliebau commented 5 months ago

I've played around with the error message a bit. The following command reads the error output on stdin (it needs GNU awk for gensub) and should convert what Kubernetes outputs into permissions we can stick in a ClusterRole object:

grep "APIGroups" | awk '{r = gensub("({.*})", "\\1,", "g"); print r }' | awk '{r = gensub("(\"[^ ]+\") ", "\\1, ", "g"); print r}' | tr -d "\n" | awk '{r = gensub("(APIGroups|Resources|Verbs|ResourceNames)","\"\\1\"", "g"); print "[" r "]"}' | yq -o yaml -P > permissions.yaml

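90% of this is to compensate for Kubernetes outputting something that resembles JSON but isn't: the objects are not separated by commas, list items are separated by spaces instead of commas, and the keys are unquoted.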

Once those are fixed, it is really just a matter of piping the result through yq to convert it from JSON to YAML.
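For comparison, the same conversion can be sketched in Python without awk or yq. This is only an illustration under the assumption that every rule appears on its own line in the format shown in the error message above; the names `parse_rules`, `FIELD`, and `KEYS` are made up for this example. The output is JSON, which is also valid YAML.

```python
import json
import re

# Matches one field of a rule line such as
# {APIGroups:[""], Resources:["pods"], Verbs:["get" "list"]}
FIELD = re.compile(r'(APIGroups|Resources|ResourceNames|Verbs):\[([^\]]*)\]')

# Map the field names in the error message to the keys used in RBAC manifests.
KEYS = {
    "APIGroups": "apiGroups",
    "Resources": "resources",
    "ResourceNames": "resourceNames",
    "Verbs": "verbs",
}


def parse_rules(error_text):
    """Return a list of RBAC rule dicts found in the error text."""
    rules = []
    for line in error_text.splitlines():
        line = line.strip()
        if not line.startswith("{APIGroups"):
            continue  # skip everything that is not a rule line
        rule = {}
        for name, body in FIELD.findall(line):
            # Values are double-quoted strings separated by spaces.
            rule[KEYS[name]] = re.findall(r'"([^"]*)"', body)
        rules.append(rule)
    return rules


sample = '''
{APIGroups:[""], Resources:["nodes"], Verbs:["list" "watch"]}
{APIGroups:["rbac.authorization.k8s.io"], Resources:["clusterroles"], ResourceNames:["trino-clusterrole"], Verbs:["bind"]}
'''

# The printed list can be placed under "rules:" in a ClusterRole manifest.
print(json.dumps(parse_rules(sample), indent=2))
```

Parsing field by field avoids having to repair the message into valid JSON first, at the cost of hard-coding the four field names.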

fhennig commented 5 months ago

Is this something we can document once for all operators, or do we need to document it for every operator individually? What would be a good way to document this? It feels very detailed and prone to becoming outdated.