Open dghubble opened 4 years ago
Thanks for raising the concerns; would be great to see if we can tighten some of those permissions up.
I don't think we're planning to sunset the manifests in the near term but there are many drivers for the move towards the operator and it is not intended to be "dev only".
If the community would stop releasing new k8s distros, maybe we could backtrack on the operator; deal? :laughing:
FWIW, a lot of the permissions we have come from the set-up on OpenShift, where the operator has to run in one namespace and the Calico components in another (and still more for the Calico Enterprise components). Since Secrets are namespaced and multiple components need the secrets, we need the operator to copy various secrets from its namespace to others. Would be great if we could lock that down to specific named namespaces and secrets.
We should be able to lock down to specific named resources using RBAC as well. That was the intention from the beginning, but we haven't gotten that far yet!
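As a rough illustration of what "locking down to specific named resources" could look like, here is a minimal Python sketch that builds a namespaced Role restricted by `resourceNames`. The role name, namespace, and secret names are all hypothetical placeholders, not the names the operator actually uses.

```python
import json


def named_secret_role(namespace, secret_names):
    """Build a namespaced Role manifest limited to the given Secret names.

    All names passed in here are illustrative assumptions.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "example-operator-secrets", "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # core API group, where Secrets live
                "resources": ["secrets"],
                # Only these named Secrets are covered by the rule.
                "resourceNames": sorted(secret_names),
                # "create" is left out on purpose: RBAC cannot restrict
                # create requests by resource name.
                "verbs": ["get", "update", "patch", "delete"],
            }
        ],
    }


role = named_secret_role("example-namespace", ["example-tls-b", "example-tls-a"])
print(json.dumps(role, indent=2))
```

One caveat with this approach: because `create` cannot be limited by `resourceNames`, an operator that copies secrets into other namespaces would still need some broader create permission there, so named-resource rules narrow the access rather than remove it.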
Will you continue to maintain (operatorless) manifests?
There is no intention to remove support for these in the near to medium term. However, as we continue to make strides with the operator our intention is to more and more strongly recommend that approach. That's of course all subject to change based on the unknowable future!
Feedback like this issue is really important to receive and will help us make sure the operator approach is meeting all of the right community needs, as well as the purposes that it originally set out to do.
Do you intend to make Tigera Operator a required component?
Similar to the above, definitely not in the short to medium term. Maybe some day, but it's pretty hard to see that happening.
A similar story - it wasn't too long ago that installing Calico using a DaemonSet was new and exciting, and worrying to some! Most users installed Calico underneath Kubernetes rather than on top of it. We switched, got lots of feedback, incorporated it, and nowadays very very few users are not using a DaemonSet to install Calico, but it is still possible to do so. My hope is that we go through the same process with the operator.
Thanks for the details and thought around this.
I definitely understand you want to provide ease of use and support many different platforms/distros/users. Relevant to this is trying to design for both end-users (humans following Calico tutorials who may find value in an installer doing things for them) and designing a component for use in other clusters/systems.
As a distro, my end-users won't see whether the cluster came with the right manifests or an operator added them. They will see that the RBAC profiles increased in scope. I can't speak to OpenShift and their needs. But at the moment, in my distro, Tigera Operator would require an unprecedented level of access, so I can stick to loosely basing my manifests on the operatorless ones, as I've done historically.
I know operators are pretty open-ended. Some focus on managing their own defined custom resources and that seems pretty reasonable (foo operator manages Foo resources, sure). Some also take on being installers and that worries me some. It's hard for me to justify handing out cluster-wide access to ClusterRoles, Deployments, DaemonSets when I could just as well have those made at cluster bootstrap, leaving the steady-state RBAC strictly limited.
I do wonder if there are other ways you could aim to reduce complexity and ease maintainability, perhaps complementary to having an operator installer. While I don't have the expectation that Calico provides a tutorial for every case (part of the distro's job), I'm sure your issue tracker feels that burden regardless. Another angle might be borrowing ideas from other CNI providers (central configmap vs env vars, MTU detection, delegation of vendor examples, clamp down on CRD creep). I don't presume to know the solutions, and I'm sure it's something you're always thinking about. But I do see the problem too. Calico is the default in my distro, but it does admittedly have the most moving parts.
In fairness, I don't want to leave a broad issue that isn't tracking a concrete thing. You've both provided insights into the operator/installer's direction and goals (which answers my initial questions, thx) and I've shared some counterpoint concerns about it to maybe consider.
I can close if you want to track potential RBAC scope reductions somewhere else.
Expected Behavior
Without Tigera Operator, Calico was deployed with a restrictive ClusterRole that could mostly just get/list pods (example).
Current Behavior
Calico v3.16 docs and releases appear to be favoring Tigera Operator more.
I've seen some projects introduce an operator as a mechanism to create the manifests that a user would typically just create directly (rather than for custom APIs). I imagine that's the story behind Tigera Operator. The cost is that Tigera Operator uses a high level of access to the cluster (ClusterRole).
Effectively, Tigera Operator is running as a cluster-admin.
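One rough way to sanity-check a claim like this against any given ClusterRole is to look for wildcard rules. The sketch below is only a first-pass heuristic, and both sample roles are made up for illustration; they are not the actual Calico or Tigera Operator roles.

```python
def wildcard_rules(cluster_role):
    """Return the rules that use "*" for apiGroups, resources, or verbs."""
    fields = ("apiGroups", "resources", "verbs")
    flagged = []
    for rule in cluster_role.get("rules", []):
        if any("*" in rule.get(field, []) for field in fields):
            flagged.append(rule)
    return flagged


# Hypothetical example roles, written as plain dicts.
limited_role = {
    "kind": "ClusterRole",
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
    ],
}
broad_role = {
    "kind": "ClusterRole",
    "rules": [
        {"apiGroups": ["apps"], "resources": ["daemonsets", "deployments"], "verbs": ["*"]},
        {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["clusterroles"], "verbs": ["create", "update"]},
    ],
}

print(len(wildcard_rules(limited_role)))  # 0
print(len(wildcard_rules(broad_role)))    # 1
```

Note that the second rule in `broad_role` is not flagged even though write access to ClusterRoles is itself very powerful, so a wildcard check like this understates the real scope and is no substitute for reading the rules.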
Possible Solution
Calico continues to maintain (operatorless) manifests and recommends them for production. Components are created directly (e.g. DaemonSet). Production users continue to use a limited ClusterRole.
Context
What are you trying to accomplish?
Maintaining direct control over what is applied to clusters and using limited RBAC.
Your Environment