kubernetes-sigs / controller-runtime

Repo for the controller-runtime subproject of kubebuilder (sig-apimachinery)
Apache License 2.0

[Proposal] - Dynamic Cache for Gracefully Handling RBAC Changes #2176

Open everettraven opened 1 year ago

everettraven commented 1 year ago

Problem Statement: As an operator author, I want to develop an operator that can handle changing permissions so that cluster admins can use Role-Based Access Control (RBAC) to scope the permissions given to my operator.

In order to grant cluster admins the ability to scope the permissions given to operators, we need to provide operator authors an easy way to handle dynamically changing permissions.

Background/Context

There have recently been questions posed in various venues regarding the ability to scope operator permissions. The existing cache implementations have a limitation: they cannot reliably handle changes in RBAC and require that specific permissions be granted to their associated ServiceAccount at all times. If those permissions are removed, the controller will either crash or enter an infinite loop that blocks reconciliation.
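For concreteness, here is a minimal sketch of that status quo. This is not from the proposal; the `foov1` API package and the reconcile body are hypothetical placeholders. The point is that all watches are declared up front, so the cache needs list/watch RBAC for every watched type (cluster-wide by default) for the whole lifetime of the process.

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	foov1 "example.com/foo-operator/api/v1" // hypothetical API package for the Foo CR
)

func main() {
	// Scheme registration for foov1 omitted for brevity.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		panic(err)
	}

	// For() and Owns() register informers as soon as the manager starts, so the
	// ServiceAccount must be able to list/watch Foos, Secrets, and Deployments
	// cluster-wide. If that RBAC is later removed, the informers' reflectors
	// fail repeatedly and the cache cannot recover on its own.
	err = ctrl.NewControllerManagedBy(mgr).
		For(&foov1.Foo{}).
		Owns(&corev1.Secret{}).
		Owns(&appsv1.Deployment{}).
		Complete(reconcile.Func(func(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
			// ... create/update the Secret and Deployment for this Foo ...
			return reconcile.Result{}, nil
		}))
	if err != nil {
		panic(err)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```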

This proposal introduces a cache implementation that can dynamically handle changes in RBAC. It covers the advantages and disadvantages and introduces an existing proof of concept.

Advantages

For Operator/Controller Authors

For Cluster Admins

For the Operator/Controller Ecosystem

There are likely more advantages that are not listed here.

Disadvantages

For Operator/Controller Authors

There are likely more disadvantages that are not listed here.

Proof of Concept

I have worked on a few iterations of a PoC for this, but the latest and most promising one can be found here: https://github.com/everettraven/telescopia

There is also a sample operator that uses the telescopia library here: https://github.com/everettraven/scoped-operator-poc/tree/poc/telescopia (specifically the poc/telescopia branch). There is a demo in the README that I highly recommend taking a look at to get a better idea of how an operator/controller may behave with this concept implemented.


I would love to get feedback and thoughts on the advantages and disadvantages of this, as well as whether implementing it would be of value to the controller-runtime project and the community.

I look forward to discussing this further!

alvaroaleman commented 1 year ago

cc @vincepri

alvaroaleman commented 1 year ago

Could you define what exactly "that would allow for dynamically handling changes in RBAC" means? RBAC gets added to or removed from a SA used by a controller; what exactly is supposed to happen?

everettraven commented 1 year ago

> Could you define what exactly "that would allow for dynamically handling changes in RBAC" means?

IMO this means that when changes are made to the RBAC on a SA used by a controller, that controller won't crash or enter a blocking loop and will continue to operate as expected. Maybe this is better phrased as "that would allow for gracefully handling changes in RBAC"?

> RBAC gets added to or removed from a SA used by a controller; what exactly is supposed to happen?

TLDR:

A more detailed answer:

I'm going to try answering this question with a scenario that will hopefully show what should happen when adding/removing RBAC.

Let's consider a controller that reconciles the Foo CR. Let's assume that, as a result of successful reconciliation, a Secret and a Deployment are created. Using this dynamic cache, the controller does not establish any watches other than a cluster-wide watch on the Foo CR.

When installing this controller, a cluster admin creates RBAC that allows the controller to CRUD Secrets and Deployments in the foo and bar namespaces, but no other namespaces. Nothing happens upon adding the RBAC until a CR is created to be reconciled (this introduces a pattern of establishing minimal watches only when they are needed; a rough sketch of that pattern follows).
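To make that concrete, here is a rough sketch of what the Foo reconciler could look like on top of such a cache. The `DynamicWatcher` interface and its method names are hypothetical placeholders for the proposed API (nothing like this exists in controller-runtime today); only the shape of the pattern matters.

```go
package controllers

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// DynamicWatcher is a hypothetical stand-in for the proposed dynamic cache API.
type DynamicWatcher interface {
	// HasWatch reports whether a watch for this kind already exists in the namespace.
	HasWatch(obj client.Object, namespace string) bool
	// AddWatch starts a namespace-scoped watch and returns an error (e.g. Forbidden)
	// if it cannot be established with the controller's current RBAC.
	AddWatch(ctx context.Context, obj client.Object, namespace string) error
}

type FooReconciler struct {
	client.Client
	Watcher DynamicWatcher
}

func (r *FooReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Only the cluster-wide watch on Foo exists at startup. Watches on the
	// secondary resources are created lazily, scoped to the namespace of the
	// Foo being reconciled, so RBAC only needs to cover namespaces that
	// actually contain Foo CRs.
	for _, obj := range []client.Object{&corev1.Secret{}, &appsv1.Deployment{}} {
		if r.Watcher.HasWatch(obj, req.Namespace) {
			continue
		}
		if err := r.Watcher.AddWatch(ctx, obj, req.Namespace); err != nil {
			// Permission missing for this namespace: surface it (e.g. via a
			// status condition) instead of crashing.
			return ctrl.Result{}, err
		}
	}

	// ... create/update the Secret and Deployment for this Foo ...
	return ctrl.Result{}, nil
}
```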

So, let's see what should happen when a Foo CR is created in the foo and bar namespaces:

Now, what should happen if a Foo CR is created in the baz namespace (or any other namespace the controller is not permitted to operate in):

Now let's consider what happens if a cluster admin says "Oops, I meant for it to be allowed to operate in the baz namespace but not bar!". The cluster admin would remove the RBAC for the bar namespace and add the RBAC for the baz namespace, causing:

As a bonus scenario, what happens if the cluster admin no longer wants to allow the controller to have cluster-wide permissions for Foo CRs:

One thing I just realized I omitted from the original proposal comment: this also introduces the idea of a controller having a set of minimal required permissions (i.e. CRUD on its own CRs) and a set of optional or scopeable permissions.

This is all open to discussion and subject to change based on said discussion, but these are the concepts that my proof of concept and example operator currently use.

@alvaroaleman I hope this answers your questions, but I'm happy to answer any follow up questions or make any clarifications if needed.

alvaroaleman commented 1 year ago

> Foo CR status is updated to signify that the controller does not have the proper permissions to operate in this namespace

We can't update the status of something we can't access.

IMHO this is still a bit too vague: "will continue to operate as expected" can mean very different things based on what the controller does. I also don't think it's realistic to try hiding that. Dynamically re-establishing watches with a different configuration based on failures alone is something that doesn't sound to me like it can be generalized very well.

In general, there is some ongoing work to allow removing watches at runtime, but I am not sure it is a good idea to do this automatically underneath a controller by making assumptions about what the new RBAC might look like: https://github.com/kubernetes-sigs/controller-runtime/pull/2159

To me personally it seems more correct to have a component that manages the RBAC and that will update the configuration of the affected controllers when it changes RBAC.

everettraven commented 1 year ago

> We can't update the status of something we can't access.

Correct, but in this particular scenario the controller would still have access to the Foo CR (its own CR), just not the ability to create Secret or Deployment resources. If the controller doesn't have access to the Foo CR, it will log errors until it has the permissions again, but the key is that it doesn't end up in an unrecoverable state.
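As a sketch of what "update the Foo CR status" could look like, continuing the hypothetical FooReconciler from the earlier sketch (the `foov1.Foo` type and its Conditions field are made up; the error and condition helpers are real apimachinery APIs):

```go
package controllers

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	foov1 "example.com/foo-operator/api/v1" // hypothetical
)

// handlePermissionError is a hypothetical helper: if creating a secondary
// resource failed because RBAC was revoked, record that on the Foo's status
// (which the controller can still write, since it keeps permissions on its own
// CR) instead of crashing or retrying forever.
func (r *FooReconciler) handlePermissionError(ctx context.Context, foo *foov1.Foo, err error) error {
	if !apierrors.IsForbidden(err) {
		return err // some other failure; let normal error handling deal with it
	}
	apimeta.SetStatusCondition(&foo.Status.Conditions, metav1.Condition{
		Type:    "PermissionsAvailable",
		Status:  metav1.ConditionFalse,
		Reason:  "Forbidden",
		Message: err.Error(),
	})
	return r.Status().Update(ctx, foo)
}
```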

> IMHO this is still a bit too vague: "will continue to operate as expected" can mean very different things based on what the controller does. I also don't think it's realistic to try hiding that.

Fair enough, I may just not be wording this well. My aim is for operator/controller authors to be able to configure their controllers to use this caching layer so that the operator/controller doesn't crash, panic, or enter some unrecoverable state when RBAC has changed.

> Dynamically re-establishing watches with a different configuration based on failures alone is something that doesn't sound to me like it can be generalized very well.

This might be a misunderstanding due to my not providing enough context. With how my PoC is implemented, the watches/informers are killed automatically (although we could make this configurable), but using this cache would introduce a new pattern for operator/controller authors to adopt when reconciling a CR: the author is responsible for checking whether the necessary watches/informers exist in the cache and, if they don't, recreating them; the cache won't recreate any watches/informers on its own. A sketch of the teardown side follows.
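Client-go's reflector already exposes a hook that can detect the permission failure; the part that would be new in the dynamic cache is actually deregistering the informer, represented here by a placeholder stop callback. A rough sketch, not the PoC's actual code:

```go
package cache

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	toolscache "k8s.io/client-go/tools/cache"
)

// forbiddenTeardownHandler returns a client-go WatchErrorHandler (a real
// client-go type). The stop callback is a placeholder for however the dynamic
// cache would tear down and deregister the failing informer -- that is the part
// this proposal would add.
func forbiddenTeardownHandler(stop func()) toolscache.WatchErrorHandler {
	return func(r *toolscache.Reflector, err error) {
		if apierrors.IsForbidden(err) {
			// RBAC was revoked: remove the informer instead of letting the
			// reflector retry forever. The reconciler is then responsible for
			// re-establishing the watch once permissions come back.
			stop()
			return
		}
		// Any other error: keep client-go's default behavior (log and retry).
		toolscache.DefaultWatchErrorHandler(r, err)
	}
}
```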

> In general, there is some ongoing work to allow removing watches at runtime, but I am not sure it is a good idea to do this automatically underneath a controller by making assumptions about what the new RBAC might look like: https://github.com/kubernetes-sigs/controller-runtime/pull/2159

Thanks for sharing this PR, I'll definitely take a look at this.

IMO it would be OK for the cache to automatically remove watches if they fail due to a permissions error, as long as it is well documented and operator/controller authors are informed of this behavior when they decide to use this cache implementation. That being said, I think we could also discuss making that behavior configurable via options.
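Purely as an illustration of "configurable via options" (nothing like this exists today; every name here is made up):

```go
package cache

import "k8s.io/apimachinery/pkg/runtime/schema"

// DynamicCacheOptions is a hypothetical options struct sketching how the
// auto-removal behavior could be surfaced to operator/controller authors.
type DynamicCacheOptions struct {
	// RemoveInformersOnForbidden controls whether the cache tears down an
	// informer whose watch fails with a Forbidden error (the PoC's current
	// behavior) or leaves it in place to keep retrying.
	RemoveInformersOnForbidden bool

	// OnInformerRemoved, if non-nil, is called after an informer is removed so
	// the controller can react, e.g. by updating a status condition.
	OnInformerRemoved func(gvk schema.GroupVersionKind, namespace string)
}
```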

> To me personally it seems more correct to have a component that manages the RBAC and that will update the configuration of the affected controllers when it changes RBAC.

So essentially restart the operator/controller in its entirety if the configuration is modified? I'm not sure this would resolve the core problem of preventing the operator from entering an unrecoverable state if the permissions are changed, but I am open to discussing this more if it is agreed that this is the better route for functionality like this.

everettraven commented 1 year ago

I just wanted to update this issue with a link to a brief design document that may help outline the goals and non-goals of this a bit better. I also attempted to highlight how the changes in #2159 could potentially be built upon to accomplish the goals of this dynamic cache.

cc'ing @varshaprasad96 @joelanford @vincepri to try and get some more eyes on this. Thanks!

everettraven commented 1 year ago

Looks like #2231 was recently created asking for the ability to set the WatchErrorHandler, which is a proposed step in my design document.
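For reference, the underlying hook #2231 is asking controller-runtime to expose is client-go's SetWatchErrorHandler on a shared informer (it has to be set before the informer starts). A standalone client-go sketch, just to show what that enables; none of this is controller-runtime API yet:

```go
package main

import (
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	toolscache "k8s.io/client-go/tools/cache"
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	cfg := ctrl.GetConfigOrDie()
	clientset := kubernetes.NewForConfigOrDie(cfg)

	factory := informers.NewSharedInformerFactory(clientset, 10*time.Hour)
	secretInformer := factory.Core().V1().Secrets().Informer()

	// This is the knob #2231 wants plumbed through controller-runtime's cache:
	// a per-informer handler that decides what happens when a watch fails.
	if err := secretInformer.SetWatchErrorHandler(func(r *toolscache.Reflector, err error) {
		if apierrors.IsForbidden(err) {
			// RBAC for Secrets was revoked: react here (e.g. stop the
			// informer) instead of retrying forever.
			return
		}
		toolscache.DefaultWatchErrorHandler(r, err)
	}); err != nil {
		panic(err)
	}

	// factory.Start(...) / informer startup omitted for brevity.
}
```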

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

inteon commented 1 year ago

@everettraven It sounds like the same mechanism can be used to only watch when a CRD is installed?

everettraven commented 1 year ago

> It sounds like the same mechanism can be used to only watch when a CRD is installed?

I would assume so, but this is a use case I have not tested. As long as you check that the CRD is installed in your reconciliation loop and, if it is, create the watch, it should work.
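A sketch of the "check that the CRD is installed" step using the client's RESTMapper (`RESTMapping` and `IsNoMatchError` are real apimachinery APIs; the helper itself is just illustrative):

```go
package controllers

import (
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// crdInstalled reports whether the API for the given kind is currently served,
// which is a reasonable proxy for "is the CRD installed?". A NoMatch error
// simply means it is not (yet) available.
func crdInstalled(c client.Client, gvk schema.GroupVersionKind) (bool, error) {
	_, err := c.RESTMapper().RESTMapping(gvk.GroupKind(), gvk.Version)
	if err == nil {
		return true, nil
	}
	if apimeta.IsNoMatchError(err) {
		return false, nil
	}
	return false, err
}
```

Inside the reconcile loop you would call this before creating the watch and simply requeue (or skip the optional behavior) while it returns false. Depending on which RESTMapper is in use, discovery results may be cached, which is related to the follow-up below.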

akalenyu commented 1 year ago

> @everettraven It sounds like the same mechanism can be used to only watch when a CRD is installed?

Interesting! There is another issue potentially in the same basket: if the CRD was not available at the time of the first informer setup, it looks like the cached client will never recover and just keeps returning IsNoMatchError forever.

EDIT: My bad, this is due to using the discovery mapper instead of the dynamic one. But there might be a bug in the defaulting mechanism in pkg/cache; I will look into it.

EDIT: https://github.com/kubernetes-sigs/controller-runtime/pull/2491
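For anyone else hitting this: the difference between the two mappers is essentially whether discovery results can be refreshed after a NoMatch. Outside of controller-runtime's own wiring (which varies by version), the plain client-go building blocks look roughly like this; the Widget GVK is a made-up example of a CRD installed after startup:

```go
package main

import (
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/discovery/cached/memory"
	"k8s.io/client-go/restmapper"
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// A deferred, cache-backed mapper can be invalidated at runtime, so a CRD
	// installed after startup can still be resolved later. A mapper built once
	// from a static discovery snapshot cannot.
	mapper := restmapper.NewDeferredDiscoveryRESTMapper(memory.NewMemCacheClient(dc))

	gvk := schema.GroupVersionKind{Group: "widgets.example.com", Version: "v1", Kind: "Widget"} // hypothetical
	if _, err := mapper.RESTMapping(gvk.GroupKind(), gvk.Version); apimeta.IsNoMatchError(err) {
		// The kind may just be missing from the cached discovery data:
		// invalidate the cache and try again.
		mapper.Reset()
		if _, err := mapper.RESTMapping(gvk.GroupKind(), gvk.Version); err != nil {
			_ = err // still NoMatch => the CRD really isn't installed
		}
	}
}
```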

k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 8 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/controller-runtime/issues/2176#issuecomment-1901921339):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

vincepri commented 7 months ago

/reopen

k8s-ci-robot commented 7 months ago

@vincepri: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/controller-runtime/issues/2176#issuecomment-1904625202):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

vincepri commented 7 months ago

/lifecycle frozen