cloudfoundry / cf-crd-explorations

Explore: Admission controllers for authorization #36

Closed gcapizzi closed 3 years ago

gcapizzi commented 3 years ago

Background

Currently, CF on VMs relies on a set of roles for authorization, each role having specific permissions. Role permissions are hardcoded and information about which user has which role is stored in the Cloud Controller database.

The Kubernetes way of handling authorization is quite different: the most common method is Role-based Access Control (RBAC). Permissions are expressed as the ability to perform standard operations (verbs, like get, watch or list) on resources. These permissions are stored in (Cluster)?Roles and bound to users via (Cluster)?RoleBindings. In #35 we try to understand whether RBAC alone is a viable approach for us.
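To make the verbs-on-resources model concrete, here is a minimal Python sketch of how a role's rules could be matched against a request. It is a simplified model rather than the Kubernetes authorizer: it ignores resourceNames, subresources and non-resource URLs, and the helper names (`rule_allows`, `role_allows`) and the `secret_reader` role are made up for illustration.

```python
def _matches(allowed, value):
    """An RBAC list matches when it contains the value or the "*" wildcard."""
    return "*" in allowed or value in allowed


def rule_allows(rule, api_group, resource, verb):
    """Return True if a single rule grants `verb` on `resource` in `api_group`."""
    return (
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
    )


def role_allows(role, api_group, resource, verb):
    """RBAC is purely additive: a role allows a request if any of its rules matches."""
    return any(rule_allows(r, api_group, resource, verb) for r in role.get("rules", []))


# Hypothetical role: read-only access to secrets in the core ("") API group.
secret_reader = {
    "kind": "Role",
    "rules": [
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "watch", "list"]},
    ],
}

print(role_allows(secret_reader, "", "secrets", "list"))    # True
print(role_allows(secret_reader, "", "secrets", "create"))  # False
```

Since rules can only grant access, a condition of the form "allow this verb, but only if ..." has no place in a role, which is the gap the admission controller approach is meant to fill.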

A much more sophisticated way of performing authorization on Kubernetes is the use of admission controllers, which allow us to implement custom authorization logic. The best way to leverage this is probably Gatekeeper, an admission controller which delegates authorization to the Open Policy Agent (OPA). OPA makes it possible to express authorization rules using a declarative logic language called Rego.

Questions

We want to understand how feasible and practical it would be to introduce an admission controller to perform those authorisations that don't map well to RBAC.

In order to do this, we also need to answer the storage question: where do we store roles and role bindings? Some things to investigate:

- can we extend the set of verbs in a (Cluster)?Role to suit our needs? If yes, we could then declare CF roles as regular (Cluster)?Roles
- can we create a set of special, empty (Cluster)?Roles that would be recognised and treated differently by our admission controller?
- do any of the above solutions interfere with regular RBAC?
- do we need to come up with our own resources for roles and bindings?

We also want to evaluate OPA Gatekeeper against a custom admission controller:

- which one would be more feasible?
  - how do we handle state stored in resources from OPA Rego rules?
- which one would be more expensive?
- which one would be more secure?

kieron-dev commented 3 years ago

We are investigating the possibility of reusing the RBAC as is, and writing rules to supplement it in OPA.

We have OPA installed with kube-mgmt in Kind using the https://github.com/eirini-playground/auth-explore/blob/master/opa/setup.sh script.

We can create a ClusterRole called space-developer, and a RoleBinding in the test namespace for the user alice to the space-developer ClusterRole.

We can then configure kube-mgmt to replicate the role bindings to OPA using the --replicate=rbac.authorization.k8s.io/v1/rolebindings argument.

This should give us enough power to write a Rego rule that gets the current user's role names by looking at the role bindings, and allows a certain action (secret creation, say) if the space-developer role is one of the user's roles.
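The intended rule logic can be modelled in a few lines of Python. This is only a sketch of the decision, not the Rego rule itself: the function names are hypothetical, the request contains just the admission request fields the sketch reads, and the replicated role bindings are assumed to be keyed as `{namespace: {name: binding}}`.

```python
def user_role_names(request, rolebindings):
    """Names of the roles bound directly to the requesting user in the request's namespace.

    `rolebindings` is assumed to be keyed as {namespace: {binding name: RoleBinding}}.
    """
    username = request["userInfo"]["username"]
    names = set()
    for binding in rolebindings.get(request["namespace"], {}).values():
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "User" and subject.get("name") == username:
                names.add(binding["roleRef"]["name"])
    return names


def deny(request, rolebindings):
    """Return a list of denial messages; an empty list means the request is admitted."""
    if request["kind"]["kind"] != "Secret" or request["operation"] != "CREATE":
        return []
    if "space-developer" in user_role_names(request, rolebindings):
        return []
    user = request["userInfo"]["username"]
    return [f"{user} is not a space-developer in namespace {request['namespace']}"]


# Hypothetical replicated data: alice is bound to space-developer in "test".
rolebindings = {
    "test": {
        "alice-space-developer": {
            "roleRef": {"kind": "ClusterRole", "name": "space-developer"},
            "subjects": [{"kind": "User", "name": "alice"}],
        }
    }
}


def secret_create(username, namespace="test"):
    """Build the subset of an admission request that the sketch reads."""
    return {
        "kind": {"kind": "Secret"},
        "operation": "CREATE",
        "namespace": namespace,
        "userInfo": {"username": username},
    }


print(deny(secret_create("alice"), rolebindings))  # []
print(deny(secret_create("bob"), rolebindings))    # one denial message
```

The sketch only looks at subjects of kind User; bindings that name a group would need the group list from the request's userInfo as well.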

At the moment, OPA is failing to replicate the rolebindings. The error message from kube-mgmt is:

kube-mgmt time="2021-07-23T15:29:58Z" level=error msg="Sync for rbac.authorization.k8s.io/v1/rolebindings failed due to OPA error. Trying again in 1s. Reason: list: rolebindings.rbac.authorization.k8s.io is forbidden: User "system:serviceaccount:opa:default" cannot list resource "rolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope"

The kube-mgmt arg --replicate-cluster is used to replicate cluster resources. We have used the --replicate arg, which is for namespaced resources. So we're currently stumped as to why this is failing.

kieron-dev commented 3 years ago

The story mentions using Gatekeeper. We are using kube-mgmt since we had a scripted installation already. Maybe we could split off another story to evaluate whether Gatekeeper is preferred to kube-mgmt. I had a recollection that Gatekeeper was still in beta. But maybe that's what the OPA docs say, and they might be out of date. Gatekeeper appears to be at v3.5.1 which doesn't sound particularly beta to me.

Anyway, it's probably worth deferring this until after we've had some experience with kube-mgmt so we can make an informed judgement.

kieron-dev commented 3 years ago

You can replicate rolebindings in kube-mgmt by granting the OPA service account the appropriate RBAC permissions for rolebindings, as these are not included in the view cluster role. For example:

 kind: ClusterRole
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   name: role-viewer
 rules:
 - apiGroups: ["rbac.authorization.k8s.io"]
   resources: ["rolebindings"]
   verbs: ["get", "list", "watch"]
 ---
 kind: ClusterRoleBinding
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   name: rolebinding-viewer
 roleRef:
   kind: ClusterRole
   name: role-viewer
   apiGroup: rbac.authorization.k8s.io
 subjects:
 - kind: Group
   name: system:serviceaccounts:opa
   apiGroup: rbac.authorization.k8s.io

Then we can see the rolebindings with this Rego rule:

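package kubernetes.admission

# Role bindings replicated into OPA by kube-mgmt.
import data.kubernetes.rolebindings

operations = {"CREATE", "UPDATE"}

# Debug rule: reject Secret creates and updates, echoing the replicated role bindings.
deny[msg] {
  input.request.kind.kind == "Secret"
  operations[input.request.operation]
  msg = sprintf("rolebindings: %q", [rolebindings])
}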

which outputs all the rolebindings when someone tries to create a secret.

danail-branekov commented 3 years ago

- We managed to enhance the setup script to set up a cluster with OPA enabled, Dex (with a single alice@vcap.me user), CAPI custom resources, CAPI sample resources, and our RBAC.
- We assigned the space-developer role to alice by applying this.
- As a space developer, alice could edit droplets (as specified in the RBAC). Now we wanted to limit alice's update capabilities to labels and annotations only.
- To do that we came up with this rego rule, which denies the request if the old object and the new object differ in anything other than labels and annotations.
- Having applied the rule above, as alice, we managed to kubectl apply a label change, but changing the image was denied :)
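The comparison behind that rule can be sketched in Python as follows. This models the idea only and is not the rego rule referenced above: the function names are hypothetical, and the Droplet shape in the usage example (including where the image lives) is an assumption made for illustration.

```python
import copy


def without_labels_and_annotations(obj):
    """Return a deep copy of `obj` with metadata.labels and metadata.annotations removed."""
    stripped = copy.deepcopy(obj)
    metadata = stripped.get("metadata")
    if isinstance(metadata, dict):
        metadata.pop("labels", None)
        metadata.pop("annotations", None)
    return stripped


def deny_droplet_update(request):
    """Deny Droplet updates that change anything besides labels and annotations."""
    if request["kind"]["kind"] != "Droplet" or request["operation"] != "UPDATE":
        return []
    old = without_labels_and_annotations(request["oldObject"])
    new = without_labels_and_annotations(request["object"])
    if old == new:
        return []
    return ["only labels and annotations of a Droplet may be updated"]


def droplet_update(old, new):
    """Build the subset of an admission request that the sketch reads."""
    return {"kind": {"kind": "Droplet"}, "operation": "UPDATE", "oldObject": old, "object": new}


# Hypothetical Droplet; the real custom resource will have a different shape.
droplet = {
    "kind": "Droplet",
    "metadata": {"name": "my-droplet", "labels": {"env": "dev"}},
    "spec": {"image": "registry.example/app:1"},
}

relabelled = copy.deepcopy(droplet)
relabelled["metadata"]["labels"]["env"] = "prod"

reimaged = copy.deepcopy(droplet)
reimaged["spec"]["image"] = "registry.example/app:2"

print(deny_droplet_update(droplet_update(droplet, relabelled)))  # []
print(deny_droplet_update(droplet_update(droplet, reimaged)))    # one denial message
```

On a live cluster, server-managed metadata may also differ between the old and the new object, so a real rule would likely have to ignore more fields than this sketch does.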

Next, we are going to check the rolebindings of the user so that the rule applies to space developers only.

eirinici commented 3 years ago

Today we managed to come up with a rego rule that limits users assigned to the space-developer role to only be able to update labels and annotations of Droplets. The OPA rule is applied on top of RBAC, i.e. if RBAC denies the request, the OPA rule is never evaluated.

The rule will not affect updates from users that are not assigned to the space-developer role.

There are two ways to assign a user to a role:

- Directly bind the user to the role via a subject with kind==User
- Bind the role to a group that the user belongs to, via a subject with kind==Group

Getting groups to work is a bit tricky:

- Users and groups are not persisted in K8S; instead, K8S trusts the JWT tokens (where users and groups are described) and just "aggregates them"
- It is the IDP's responsibility to populate the groups the user belongs to
- In order for groups to be available to K8S, one should configure the OIDC group claim in the cluster config like this
- In order to easily test groups, we forked Dex and hacked it to always assign the user to the goo group

As a result:

- If no rolebindings are applied, alice cannot update Droplets at all because of RBAC
- If alice-space-developer-rolebinding is applied, then alice can only update labels and annotations on Droplets but cannot update anything else (because of the OPA rule)
- If goo-space-developer-rolebinding is applied, then alice can only update labels and annotations on Droplets (as alice belongs to the goo group) but cannot update anything else (because of the OPA rule)
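The three outcomes above can be reproduced with a small Python model of the two-stage decision. It is a sketch under stated assumptions, not the actual rule: RBAC is reduced to "the user holds a role that grants Droplet updates" (by default only space-developer), the function names are hypothetical, and the username alice and group goo stand in for what the JWT would carry.

```python
def subject_matches(subject, user_info):
    """A binding subject matches by user name (kind User) or group membership (kind Group)."""
    if subject.get("kind") == "User":
        return subject.get("name") == user_info["username"]
    if subject.get("kind") == "Group":
        return subject.get("name") in user_info.get("groups", [])
    return False


def bound_roles(user_info, bindings):
    """Names of all roles bound to the user, directly or through one of their groups."""
    return {
        b["roleRef"]["name"]
        for b in bindings
        if any(subject_matches(s, user_info) for s in b.get("subjects") or [])
    }


def evaluate_droplet_update(user_info, bindings, only_metadata_changed,
                            roles_allowing_update=frozenset({"space-developer"})):
    """Model RBAC first, then the OPA rule, which only constrains space developers."""
    roles = bound_roles(user_info, bindings)
    if not roles & roles_allowing_update:
        return "denied by RBAC"  # the admission stage is never reached
    if "space-developer" in roles and not only_metadata_changed:
        return "denied by OPA"
    return "allowed"


alice = {"username": "alice", "groups": ["goo"]}

alice_binding = {
    "metadata": {"name": "alice-space-developer-rolebinding"},
    "roleRef": {"kind": "ClusterRole", "name": "space-developer"},
    "subjects": [{"kind": "User", "name": "alice"}],
}
goo_binding = {
    "metadata": {"name": "goo-space-developer-rolebinding"},
    "roleRef": {"kind": "ClusterRole", "name": "space-developer"},
    "subjects": [{"kind": "Group", "name": "goo"}],
}

print(evaluate_droplet_update(alice, [], True))                # denied by RBAC
print(evaluate_droplet_update(alice, [alice_binding], True))   # allowed
print(evaluate_droplet_update(alice, [alice_binding], False))  # denied by OPA
print(evaluate_droplet_update(alice, [goo_binding], False))    # denied by OPA
```

The `only_metadata_changed` flag stands for "the update touches nothing but labels and annotations"; a user who holds some other role that grants Droplet updates, and is not a space developer, passes through the second stage untouched.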

georgethebeatle commented 3 years ago

Let's attempt to answer some of the questions posed in the story, given our current understanding of OPA.

where do we store roles and role bindings? There are generally speaking 2 alternatives for this:

- If we write our own admission controller, it is up to us
- If we use RBAC+OPA/Gatekeeper, we store all authorization information as Kubernetes resources (Roles, RoleBindings, ConfigMaps, Gatekeeper custom resources, etc.)

can we extend the set of verbs in a (Cluster)?Role to suit our needs? If yes, we could then declare CF roles as regular (Cluster)?Roles
can we create a set of special, empty (Cluster)?Roles that would be recognised and treated differently by our admission controller?

In order to define our own verbs, it is likely that we will have to write our own admission controller. Given our experience with RBAC+OPA so far, it seems to cover our needs pretty well, and just using it looks better than writing our own webhook/admission controller.

do any of the above solutions interfere with regular RBAC?
do we need to come up with our resources for roles and bindings?

It feels more like OPA extends RBAC rather than interfering with it. You generally get stopped by RBAC first, but if desired you can further restrict access using OPA rules. That said, we recommend covering as much of the authorization logic as possible with RBAC and using OPA rules only to address minor exceptions (rego rules are harder to get right than roles + bindings).

Note: Any user who is allowed to define RBAC roles and bindings on the cf k8s cluster can "change" the authorization logic by doing so. Any such user should take care not to open up holes in the cf authorization logic.

danail-branekov commented 3 years ago

We managed to implement a Gatekeeper rule (constraint template and constraint) that limits users that have the space-developer role assigned to only be able to modify labels and annotations on Droplets. This is similar to the rule we had in the kube-mgmt version.

As a matter of fact, kube-mgmt is referred to as Gatekeeper v1, whereas Gatekeeper is sort of v2. Both admission controllers are quite similar and their rules are written in rego; the difference is the way those rules are made available to K8S:

- In kube-mgmt, rules are pure rego code that is put inside a configmap
- In Gatekeeper, there are dedicated custom resources that have the rego code in their spec

With regards to which one we should use, maybe we should go for Gatekeeper if it is out of beta phase because:

- Gatekeeper rules can take parameters, which significantly reduces the number of rules we would otherwise have to write in kube-mgmt
- Gatekeeper rules are bound to objects declaratively in the Constraint custom resource, whereas in kube-mgmt each rule has to check the kind of the modified object first
- Gatekeeper provides the Audit feature, which can be used to evaluate preexisting objects. This is however outside the "authorisation" domain; it is more about validation
- Gatekeeper is configured via a Config custom resource, while kube-mgmt seems to be configured via command arguments in the deployment
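The first two points can be illustrated with a rough Python model of the template/constraint split. It is a loose sketch and not Gatekeeper's API: the `match` block is flattened to a list of kinds, and the `mutableFields` parameter, the App kind and all function names are hypothetical.

```python
def changed_top_level_fields(old, new):
    """Top-level keys whose values differ between the old and the new object."""
    return {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}


def immutable_fields_template(review, parameters):
    """Template: one piece of rule logic, configured per constraint via `parameters`."""
    changed = changed_top_level_fields(review["oldObject"], review["object"])
    forbidden = sorted(changed - set(parameters["mutableFields"]))
    return [f"field {name!r} may not be changed" for name in forbidden]


def evaluate(review, constraints, template):
    """Run the template once for every constraint whose match block selects the object kind."""
    violations = []
    for constraint in constraints:
        if review["kind"]["kind"] in constraint["match"]["kinds"]:
            violations += template(review, constraint["parameters"])
    return violations


# Two constraints reuse the same template with different parameters,
# and neither needs to check the object kind inside the rule logic.
constraints = [
    {"match": {"kinds": ["Droplet"]}, "parameters": {"mutableFields": ["metadata"]}},
    {"match": {"kinds": ["App"]}, "parameters": {"mutableFields": ["metadata", "spec"]}},
]


def update_review(kind, old, new):
    """Build the subset of an admission review that the sketch reads."""
    return {"kind": {"kind": kind}, "oldObject": old, "object": new}


old = {"metadata": {"name": "x"}, "spec": {"image": "registry.example/app:1"}}
new = {"metadata": {"name": "x"}, "spec": {"image": "registry.example/app:2"}}

print(evaluate(update_review("Droplet", old, new), constraints, immutable_fields_template))
# ["field 'spec' may not be changed"]
print(evaluate(update_review("App", old, new), constraints, immutable_fields_template))
# []
```

With plain rego in a configmap, each of these variations would be a separate rule that starts by testing the object kind; here the variation lives in data, which is the reduction in rule count the first point refers to.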