operator-framework / operator-lifecycle-manager

A management framework for extending Kubernetes with Operators
https://olm.operatorframework.io
Apache License 2.0

"Descope" OLM delivered operators #2437

Open njhale opened 3 years ago

njhale commented 3 years ago

Feature Request

Note: This issue mostly consists of select snippets from a document @ecordell drafted a while back. I've curated the important bits to frame the problem for further discussion.

Scoping, Descoping, What?

In short, "scope" in OLM refers to how OLM handles the privileges granted to an operator and its users with respect to the namespaces an admin configures it to install into; i.e. the opinionated behavior of RBAC generation around ClusterServiceVersions, their InstallModes, and OperatorGroups.

Note: see the OperatorGroup docs for more details.
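
To make the relationship between InstallModes and OperatorGroups a bit more concrete, here is a rough Python sketch of the kind of check involved. The four mode names (OwnNamespace, SingleNamespace, MultiNamespace, AllNamespaces) are OLM's InstallMode types. Everything else is an assumption for illustration: the function names `required_install_modes` and `supports`, the use of an empty string to mean "all namespaces", and the simplified rules. This is not OLM's actual implementation.

```python
ALL_NAMESPACES = ""  # assumption: empty string stands for "every namespace"


def required_install_modes(operator_namespace, target_namespaces):
    """Return the InstallModes a CSV would need to support for an
    OperatorGroup living in `operator_namespace` and targeting
    `target_namespaces`.

    Simplified, illustrative rules; not OLM's real validation logic.
    """
    targets = set(target_namespaces)
    if not targets:
        raise ValueError("an OperatorGroup must select at least one namespace")
    if ALL_NAMESPACES in targets:
        if len(targets) > 1:
            raise ValueError("cannot mix all-namespaces with named namespaces")
        return {"AllNamespaces"}

    modes = set()
    if len(targets) > 1:
        modes.add("MultiNamespace")
    if operator_namespace in targets:
        # The operator watches the namespace it is installed in.
        modes.add("OwnNamespace")
    elif len(targets) == 1:
        # Exactly one target, and it is not the operator's own namespace.
        modes.add("SingleNamespace")
    return modes


def supports(csv_install_modes, operator_namespace, target_namespaces):
    """True if the CSV's supported modes cover everything the group requires."""
    needed = required_install_modes(operator_namespace, target_namespaces)
    return needed <= set(csv_install_modes)
```

For example, under these assumed rules a CSV that only supports AllNamespaces could not be installed into an OperatorGroup in `ops` that targets only `team-a`, since that combination requires SingleNamespace.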

Problem

APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user that wishes to see them. Even operators that agree on a particular GVK may have differing opinions on how those objects should be admitted to a cluster, or how conversion between API versions should happen.

With the Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they’re in the same cluster or not. But re-using operators compounds the scoping problems within a cluster: it increases the likelihood that more than one “opinion” about an API exists in the cluster.

History

When OLM was first written, CRDs defined only the existence of a GVK in a cluster. Operators developed for OLM could only install in a namespace, watching that namespace - this delivered on the self-service, operational-encoding story of operators. The same operator could be installed in every namespace of a cluster.

Privilege escalation became a concern - since operators are run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.

At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had lots of options to interfere with one another if they shared the same CRD. OLM also expanded to support APIServices in addition to operators based on CRDs, and so required a notion of cluster-wide operators.

To address these concerns, a notion of scoping operators was introduced via the OperatorGroup object. An OperatorGroup would specify a set of namespaces within a cluster in which all operators installed would share the same scope. OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.
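
The collision rule described above can be sketched roughly as follows. This is only an illustration of the idea that at most one operator may own a given CRD in any one namespace. The function name `find_crd_conflicts`, the input shape, and the operator names are hypothetical and do not reflect how OLM actually stores or resolves this.

```python
from collections import defaultdict


def find_crd_conflicts(operators):
    """Find (namespace, CRD) pairs claimed by more than one operator.

    `operators` maps a hypothetical operator name to a dict with
    "targets" (namespaces in its scope) and "owned_crds" (CRD names it owns).
    """
    owners = defaultdict(set)
    for name, spec in operators.items():
        for namespace in spec["targets"]:
            for crd in spec["owned_crds"]:
                owners[(namespace, crd)].add(name)
    # Keep only the pairs with two or more claimed owners.
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}


# Hypothetical example: two installs of the same operator whose scopes
# overlap in the "shared" namespace.
CRD = "etcdclusters.etcd.database.coreos.com"
operators = {
    "etcd-a": {"targets": ["team-a", "shared"], "owned_crds": [CRD]},
    "etcd-b": {"targets": ["team-b", "shared"], "owned_crds": [CRD]},
}
conflicts = find_crd_conflicts(operators)
# conflicts == {("shared", CRD): ["etcd-a", "etcd-b"]}
```

Only the overlapping namespace is reported; `team-a` and `team-b` each have a single owner and would be allowed under this simplified rule.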

Proposal

Entirely remove the notion of scoping from OLM; i.e. "descope".

This means that:

It does not mean that:

Design!?

The specifics of how we will achieve descoping will need an enhancement proposal to be made clear. Such a proposal will, at minimum, need to cover:

bitscuit commented 2 years ago

Is there any proposal yet or draft of one?