[Proposal] - Dynamic Cache for Gracefully Handling RBAC Changes

everettraven commented 1 year ago

Problem Statement: As an operator author I want to develop an operator that can handle changing permissions so that cluster admins can use Role Based Access Control (RBAC) to scope the permissions given to my operator.

In order to grant cluster admins the ability to scope the permissions given to operators, we need to provide operator authors an easy way to handle dynamically changing permissions.

Background/Context

There has recently been questions posed in various avenues regarding the ability to scope operator permissions. There is currently a limitation with the existing cache implementations in that they are not able to reliably handle changes in RBAC and require specific permissions to be given to it's associated ServiceAccount at all times. If the permissions are removed the controller will either crash or enter an infinite loop that blocks reconciliation.

This proposal introduces the concept of adding a cache implementation that would allow for dynamically handling changes in RBAC and will cover advantages, disadvantages, and introduce an existing Proof-of-Concept.

Advantages

For Operator/Controller Authors

Makes controllers more resilient to changes in RBAC
Dynamically add/remove informers as needed
A single caching layer that can manage both cluster scoped and namespace scoped informers
Operators/controllers created with this can be seen as more secure

For Cluster Admins

Allows for an operator/controller's permissions to be configured following least-privilege principle and gives cluster-admin’s more control over what an Operator/controller can and can not do.

For the Operator/Controller Ecosystem

There are numerous industries where security is a big factor in allowing certain software to run on clusters. The extra security brought by being able to scope the permissions of an operator/controller opens up the opportunity for these industries to start adopting the Operator pattern.

There are likely more advantages that are not listed here.

Disadvantages

For Operator/Controller Authors

Using this caching layer will likely result in authors having to adopt a new pattern for establishing watches
Introduces some new complexity for authors to ensure that their implementation works in various conditions

There are likely more disadvantages that are not listed here.

Proof of Concept

I have worked on a few iterations of a PoC for this, but the latest and most promising one can be found here: https://github.com/everettraven/telescopia

There is also a sample operator that uses the telescopia library here: https://github.com/everettraven/scoped-operator-poc/tree/poc/telescopia (specifically the poc/telescopia branch). There is a demo in the README that I highly recommend taking a look at to get a better idea of how an operator/controller may behave with this concept implemented.

I would love to get feedback and thoughts on the advantages and disadvantages of this as well as if implementing this is something that would be of value to the controller-runtime project and the community.

I look forward to discussing this further!

alvaroaleman commented 1 year ago

cc @vincepri

alvaroaleman commented 1 year ago

Could you define what exactly that would allow for dynamically handling changes in RBAC means? RBAC gets added or removed to a SA used by a controller, what exactly is supposed to happen?

everettraven commented 1 year ago

Could you define what exactly that would allow for dynamically handling changes in RBAC means?

IMO this means that when changes are made to the RBAC on a SA used by a controller, that controller won't crash or enter a blocking loop and will continue to operate as expected. Maybe this is better phrased as that would allow for gracefully handling changes in RBAC?

RBAC gets added or removed to a SA used by a controller, what exactly is supposed to happen?

TLDR:

Adding RBAC - Nothing happens until reconciliation of a CR that causes watches to be created that would require this RBAC
Removing RBAC - Informers are killed, affected child resources are "abandoned" (would be picked back up when/if RBAC is recreated), controller updates CR status during next reconciliation that it no longer has the necessary permissions to do X,Y,Z (whatever it needed those permissions to do) and continues operating normally.

A more detailed answer:

I'm going to try answering this question with a scenario that will hopefully show what should happen when adding/removing RBAC.

Let's consider a controller that reconciles the Foo CR. Let's assume that a result of successful reconciliation a Secret and a Deployment are created. Using this dynamic cache, the controller does not establish any watches other than a cluster-wide watch on the Foo CR.

When installing this controller a cluster admin creates the RBAC that allows the controller to CRUD a Secret and Deployment in the foo and bar namespaces but no other namespaces. Nothing happens upon adding the RBAC until a CR is created to be reconciled (this introduces a pattern of establishing minimal watches only when they are needed).

So, lets see what should happen when a Foo CR is created in the foo and bar namespace:

Watches are established for Secret and Deployment resources in the foo and bar namespace (when following minimal watch pattern this should include some kind of label/field selector)
A Secret is created in each namespace
A Deployment is created in each namespace
Foo CR status is updated to signify successful reconciliation.

Now, what should happen if a Foo CR is created in the baz namespace (or any other not permitted namespace):

Watches are attempted to be established for Secret and Deployment resources in the baz namespace, but a Forbidden error is encountered
Foo CR status is updated to signify that the controller does not have the proper permissions to operate in this namespace

Now lets consider what happens if a cluster admin says "Oops, I meant for it to be allowed to operate in the baz namespace but not bar!". The cluster admin would remove the RBAC for the bar namespace and add the RBAC for the baz namespace, causing:

Watches to be killed for the Secret and Deployment resources in the bar namespace
On next reconciliation of Foo CR in the namespace bar:
- Controller recognizes that the watch no longer exists and attempts to re-establish them but this time gets a Forbidden error
- Foo CR status is updated to signify that the controller does not have the proper permissions to operate in this namespace
On next reconciliation of Foo CR in the namespace baz:
- Watches are established for Secret and Deployment resources in the baz namespace
- A Secret is created
- A Deployment is created
- Foo CR status is updated to signify successful reconciliation.

For just an added bonus scenario, what happens if the cluster admin wants to no longer allow the controller to have cluster-wide permissions for Foo CRs:

Cluster-wide watches are not killed by default
Operator will throw errors until the permissions are recreated then will continue operating as normal

One thing I just realized I omitted in the original proposal comment is this also introduces the idea of a controller having a set of minimal required permissions (i.e CRUD its own CRs) and a set of optional or scopeable permissions.

This is all open to discussion and subject to change based on said discussion, but these are the concepts that my proof of concept and example operator currently use.

@alvaroaleman I hope this answers your questions, but I'm happy to answer any follow up questions or make any clarifications if needed.

alvaroaleman commented 1 year ago

Foo CR status is updated to signify that the controller does not have the proper permissions to operate in this namespace

We can't update the status of something we can't access.

IMHO this is still a bit too vague, will continue to operate as expected can mean very different things based on what the controller does. I also don't think its realistic to try hiding that. Dynamically re-establishing watches with a different configuration based on failures alone is something that doesn't sound to me like it can be generalized very well.

In general, there is some on-going work to allow removing watches at runtime, but I guess I am not sure it is a good idea to do this automatically under a controller by making some assumptions about how the new RBAC might potentially look like: https://github.com/kubernetes-sigs/controller-runtime/pull/2159

To me personally it seems more correct to have a component that manages the RBAC and that will update the configuration of the affected controllers when it changes RBAC.

everettraven commented 1 year ago

We can't update the status of something we can't access.

Correct, but in this particular scenario the controller would have access to the Foo CR (it's CR) just not the ability to create Secret or Deployment Resources. If the controller doesn't have access to the Foo CR it will throw errors in the logs until it has the permissions again, but the key is that it doesn't end up entering an unrecoverable state.

IMHO this is still a bit too vague, will continue to operate as expected can mean very different things based on what the controller does. I also don't think its realistic to try hiding that.

Fair enough, I may just not be wording this the best but what my aim for this is to make it so that operator/controller authors can configure their controllers to use this caching layer to prevent their operator/controller from crashing, panicking, or entering some unrecoverable state when RBAC has changed.

Dynamically re-establishing watches with a different configuration based on failures alone is something that doesn't sound to me like it can be generalized very well.

This might be a misunderstanding due to not providing enough context on my end. With how I have my PoC implemented the watches/informers will be killed automatically (although we could make this configurable) but using this cache would result in a bit of a new pattern for operator/controller authors to adopt when reconciling a CR. The operator/controller author would be responsible for checking if the necessary watches/informers exist in the cache and if they don't recreate them, the cache won't recreate any watches/informers.

In general, there is some on-going work to allow removing watches at runtime, but I guess I am not sure it is a good idea to do this automatically under a controller by making some assumptions about how the new RBAC might potentially look like: https://github.com/kubernetes-sigs/controller-runtime/pull/2159

Thanks for sharing this PR, I'll definitely take a look at this.

IMO I think it would be ok for the cache to automatically remove watches from the cache if they crash due to a permissions error as long as it is well documented and an operator/controller author is informed of this behavior if they decide to use this cache implementation. That being said, I think we could also discuss changing that behavior to make that configurable via options.

To me personally it seems more correct to have a component that manages the RBAC and that will update the configuration of the affected controllers when it changes RBAC.

So essentially restart the operator/controller in it's entirety if the configuration is modified? I'm not sure this would resolve the core problem of preventing the operator from entering an unrecoverable state if the permissions are changed, but I am open to discussing this more if it is agreed that this is the better route for functionality like this.

everettraven commented 1 year ago

I just wanted to update this issue with a link to a brief design document that may help outline the goals and non-goals of this a bit better. I also attempted to highlight how the changes in #2159 could potentially be built upon to accomplish the goals of this dynamic cache.

cc'ing @varshaprasad96 @joelanford @vincepri to try and get some more eyes on this. Thanks!

everettraven commented 1 year ago

Looks like #2231 was recently created that asks for the functionality to set the WatchErrorHandler - which is a proposed step in my design document

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

inteon commented 1 year ago

@everettraven It sounds like the same mechanism can be used to only watch when a CRD is installed?

everettraven commented 1 year ago

It sounds like the same mechanism can be used to only watch when a CRD is installed?

I would assume so, but this is a use case I have not tested. As long as you check the CRD is installed in your reconciliation loop and then if it is create the watch it should work.

akalenyu commented 1 year ago

@everettraven It sounds like the same mechanism can be used to only watch when a CRD is installed?

Interesting! There is another issue potentially in the same basket; If the CRD was not available on the first informer setup, It looks like the cached client will never recover and just keep returning IsNoMatchError forever.

EDIT: My bad, this is due to using discovery mapper instead of dynamic. But there might be a bug in the defaulting mechanism in pkg/cache, will look into it EDIT: https://github.com/kubernetes-sigs/controller-runtime/pull/2491

k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 8 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/controller-runtime/issues/2176#issuecomment-1901921339): >The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. > >This bot triages issues according to the following rules: >- After 90d of inactivity, `lifecycle/stale` is applied >- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied >- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed > >You can: >- Reopen this issue with `/reopen` >- Mark this issue as fresh with `/remove-lifecycle rotten` >- Offer to help out with [Issue Triage][1] > >Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community). > >/close not-planned > >[1]: https://www.kubernetes.dev/docs/guide/issue-triage/ Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

vincepri commented 7 months ago

/reopen

k8s-ci-robot commented 7 months ago

@vincepri: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/controller-runtime/issues/2176#issuecomment-1904625202): >/reopen Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

vincepri commented 7 months ago

/lifecycle frozen

kubernetes-sigs / controller-runtime