Closed: jkremser closed this 5 months ago
Wow, this PR is a great idea! I'm reviewing it in depth because, as you said, it's complex and RBAC is risky, but I'll post something ASAP.
@jkremser Should the instructions for restricting access to secrets be updated?
https://keda.sh/docs/2.15/operate/cluster/#restrict-secret-access
Issue #624
The changes rely on the `WATCH_NAMESPACE` variable. If it is empty, we install almost the same RBAC as before (a `ClusterRole` with a `ClusterRoleBinding`, so that the operator can watch for events in all namespaces). If it contains one or multiple namespaces (assuming this change is merged), it will use a normal `RoleBinding` together with the `ClusterRole`, so that these rules are namespaced for each namespace listed in that env var.

Note: a normal `RoleBinding` + a `ClusterRole` works the same way as a `RoleBinding` with a normal `Role`, but this way we can reuse the same resource and lower the complexity of the Helm chart a bit.

The PR also contains a way to restrict the set of secrets the operator is able to look into. By default the behavior is the same as before (well, if `WATCH_NAMESPACE` is not empty, access is restricted to secrets in that namespace), but one can also specify `--set permissions.operator.restrict.namesAllowList="{foo,bar}"` so that the operator can read only the secrets called `"foo"` and `"bar"` in that namespace.

The other RBAC-related change is the whitelisting of CRDs that can be scaled by the operator. Again, the default behavior is the same as before, but if requested, one can restrict in advance, during the Helm installation, the set of CRDs that a ScaledObject can reference (Scenario 2 below shows the values for this).
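To make the mechanism above concrete, here is a minimal sketch of what the rendered RBAC could look like with `WATCH_NAMESPACE=foo` and an allow-list of secret names. All resource names here are hypothetical; the actual chart templates may differ.

```yaml
# Hypothetical rendering: the shared ClusterRole carries the rules, and
# Kubernetes' resourceNames field narrows which secrets can be read.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: keda-operator # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["foo", "bar"] # the namesAllowList values
    verbs: ["get"] # resourceNames cannot restrict list/watch
---
# A RoleBinding (not a ClusterRoleBinding) is created in each watched
# namespace; it scopes the ClusterRole's rules to that namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keda-operator
  namespace: foo # one binding per entry in WATCH_NAMESPACE
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole # reused instead of per-namespace Roles
  name: keda-operator
subjects:
  - kind: ServiceAccount
    name: keda-operator
    namespace: keda
```

This is also why the "`RoleBinding` + `ClusterRole`" trick works: the binding, not the role, determines the scope of the granted permissions.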
The PR also splits the RBAC for all three individual components, so that the webhooks or the metrics server don't have the same rights as the operator itself (which actually had the right to read ("get") every single resource in the cluster). Now each component has its own ServiceAccount with tailored roles attached to it.
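The per-component split described above amounts to something like the following (a hedged sketch; the actual account names in the chart may differ):

```yaml
# Hypothetical result: one ServiceAccount per component instead of a
# single shared one, each bound to its own narrower role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator # full reconciliation rights in watched namespaces
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-admission-webhooks # only what validation needs
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-metrics-apiserver # only what serving external metrics needs
```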
Since the PR is quite complex to review, I've prepared some deployment scenarios and captured the resulting RBAC using the `rbac-tool` kubectl plugin.

**Scenario 1 - no restrictions, just separate RBAC for each KEDA component (which is common to each scenario)**

values.yaml
```yaml
crds:
  install: true
image:
  keda:
    repository: docker.io/jkremser/keda # this contains the unreleased change for watching multiple explicitly mentioned namespaces
    tag: latest
  webhooks:
    repository: docker.io/jkremser/keda-admission-webhooks
    tag: latest
logging:
  operator:
    level: debug
    format: json
    stackTracesEnabled: true
```

final rbac:
- `operator`: link to visualization
- `webhook`: link to visualization
- `metrics-server`: link to visualization

**Scenario 2 - some CRDs are whitelisted, watching the `default` ns + 2 whitelisted secrets**

values.yaml
```yaml
# the values.yaml from Scenario 1 + the following:
watchNamespace: default
rbac:
  enabledCustomScaledRefKinds: true
  scaledRefKinds:
    - apiGroup: argoproj.io
      kind: Rollout
    - apiGroup: cluster.x-k8s.io
      kind: MachineDeployment
    - apiGroup: cluster.x-k8s.io
      kind: MachinePool
permissions:
  operator:
    restrict:
      namesAllowList:
        - secretPassword
        - someApiKey
```

final rbac:
- `operator`: link to visualization
- `webhook`: link to visualization
- `metrics-server`: link to visualization

**Scenario 3 - all CRDs are allowed + some namespaces are being watched + some secrets are whitelisted**

values.yaml
```yaml
# the values.yaml from Scenario 1 + the following:
rbac:
  enabledCustomScaledRefKinds: true
watchNamespace: default,keda,production
permissions:
  operator:
    restrict:
      namesAllowList:
        - foo
        - bar
        - baz
```

final rbac:
- `operator`: link to visualization
- `webhook`: link to visualization
- `metrics-server`: link to visualization

**Scenario 4 - all namespaces are watched, only secrets called `"bar"` can be read**

values.yaml
```yaml
# the values.yaml from Scenario 1 + the following:
watchNamespace: "" # all namespaces -> cluster-wide rights
enabledCustomScaledRefKinds: false
permissions:
  operator:
    restrict:
      namesAllowList:
        - bar
```

final rbac:
- `operator`: link to visualization
- `webhook`: link to visualization
- `metrics-server`: link to visualization

note: for all four scenarios, I created a Deployment and a ScaledObject, checked that the HPA was created, deleted everything again, and checked whether there were any RBAC-related issues.
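The smoke test described in the note can be reproduced with manifests along these lines (the names and the trigger choice are illustrative, not necessarily the exact ones used):

```yaml
# Hypothetical smoke test: a Deployment plus a ScaledObject targeting it.
# After applying, KEDA should create an HPA for the Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 1
  selector:
    matchLabels: { app: demo }
  template:
    metadata:
      labels: { app: demo }
    spec:
      containers:
        - name: demo
          image: nginx
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: demo-so
spec:
  scaleTargetRef:
    name: demo # defaults to apps/v1 Deployment
  triggers:
    - type: cron # illustrative trigger; the cron scaler needs no external system
      metadata:
        timezone: Etc/UTC
        start: 0 8 * * *
        end: 0 17 * * *
        desiredReplicas: "2"
```

Deleting the ScaledObject should remove the HPA again; any RBAC gap would surface as `forbidden` errors in the operator logs during these steps.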