NetApp / trident

Storage orchestrator for containers
Apache License 2.0

RBAC improvements #757

Closed msau42 closed 1 year ago

msau42 commented 2 years ago

Describe the solution you'd like

We have removed many RBAC permissions from Trident for ONTAP deployments. The biggest improvements IMO are:

We have also been able to reduce many permissions because we don't use the operator to deploy and instead manually deploy the Trident controller and daemonset. Consider making these improvements and also separating out the operator to have its own set of permissions.

Here's a diff of the RBAC changes we have made. I added some comments explaining why they were modified/removed.

--- a/trident-clusterrole.yaml
+++ b/trident-clusterrole.yaml
@@ -5,34 +5,26 @@ metadata:
   name: trident-controller-csi
 rules:
   - apiGroups: [""]
-   # Only needed for operator?
-    resources: ["namespaces"]
-    verbs: ["get", "list"]
-    # Don't need PVC create/delete permissions
-  - apiGroups: [""]
-    resources: ["persistentvolumes", "persistentvolumeclaims"]
+    resources: ["persistentvolumes"]
     verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
+  - apiGroups: [""]
+    resources: ["persistentvolumeclaims"]
+    verbs: ["get", "list", "watch", "update", "patch"]
   - apiGroups: [""]
     resources: ["persistentvolumeclaims/status"]
     verbs: ["update", "patch"]
   - apiGroups: ["storage.k8s.io"]
     resources: ["storageclasses"]
-    # Write permissions only needed for operator?
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
+    verbs: ["get", "list", "watch"]
   - apiGroups: [""]
     resources: ["events"]
     verbs: ["get", "list", "watch", "create", "update", "patch"]
-  # Can reduce to only trident namespace
-  - apiGroups: [""]
-    resources: ["secrets"]
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
-  # Only needed for operator?
-  - apiGroups: [""]
-    resources: ["pods"]
-    verbs: ["get", "list", "watch"]
-  - apiGroups: [""]
-    resources: ["pods/log"]
-    verbs: ["get", "list", "watch"]
   - apiGroups: [""]
     resources: ["nodes"]
-   # Update may have been leftover from alpha/beta topology feature
-    verbs: ["get", "list", "watch", "update"]
+    verbs: ["get", "list", "watch"]
   - apiGroups: ["storage.k8s.io"]
     resources: ["volumeattachments"]
     verbs: ["get", "list", "watch", "update", "patch"]
@@ -41,31 +33,19 @@ rules:
     verbs: ["update", "patch"]
   - apiGroups: ["snapshot.storage.k8s.io"]
     resources: ["volumesnapshots", "volumesnapshotclasses"]
-   # VolumeSnapshot permissions not needed at all since beta.
-  # VolumeSnapshotClass write permissions only for operator?
-    verbs: ["get", "list", "watch", "update", "patch"]
+    verbs: ["get", "list"]
   - apiGroups: ["snapshot.storage.k8s.io"]
     resources: ["volumesnapshots/status", "volumesnapshotcontents/status"]
     verbs: ["update", "patch"]
   - apiGroups: ["snapshot.storage.k8s.io"]
     resources: ["volumesnapshotcontents"]
-   # Don't need "create/delete"
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
-  # No longer need csi alpha CRDs
-  - apiGroups: ["csi.storage.k8s.io"]
-    resources: ["csidrivers", "csinodeinfos"]
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
+    verbs: ["get", "list", "watch", "update", "patch"]
   - apiGroups: ["storage.k8s.io"]
-   # CSIDriver not handled by any sidecar anymore
-   # CSINode only needs read permissions
-    resources: ["csidrivers", "csinodes"]
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
-   # Only needed by operator?
-  - apiGroups: ["apiextensions.k8s.io"]
-    resources: ["customresourcedefinitions"]
-    verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
+    resources: ["csinodes"]
+    verbs: ["get", "list", "watch"]
   - apiGroups: ["trident.netapp.io"]
     resources: ["tridentversions", "tridentbackends", "tridentstorageclasses", "tridentvolumes","tridentnodes",
 "tridenttransactions", "tridentsnapshots", "tridentbackendconfigs", "tridentbackendconfigs/status",
 "tridentmirrorrelationships", "tridentmirrorrelationships/status", "tridentsnapshotinfos",
 "tridentsnapshotinfos/status"]
     verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
-  # PSP is deprecated
-  - apiGroups: ["policy"]
-    resources: ["podsecuritypolicies"]
-    verbs: ["use"]
-    resourceNames:
-      - tridentpods

--- /dev/null
+++ b/trident-role.yaml
@@ -0,0 +1,10 @@
+---
+kind: Role
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: trident-controller-csi
+rules:
+ - apiGroups: [""]
+   resources: ["secrets"]
+   verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
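A namespaced Role only takes effect once a RoleBinding ties it to a subject. The diff above does not include that binding, so the following is a minimal sketch of what it might look like. The namespace `trident` and the controller service account name `trident-csi` are assumptions, not taken from the diff.

```yaml
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: trident-controller-csi
  namespace: trident            # assumption: namespace where Trident is installed
subjects:
  - kind: ServiceAccount
    name: trident-csi           # assumption: service account used by the controller
    namespace: trident          # assumption: same namespace as above
roleRef:
  kind: Role                    # refers to the namespaced Role, not a ClusterRole
  name: trident-controller-csi
  apiGroup: rbac.authorization.k8s.io
```

Because both the Role and the RoleBinding are namespaced, the controller can reach secrets only in that one namespace instead of cluster-wide.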

Separating out controller and daemonset service accounts:

+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: trident-node-csi

+++ b//trident-daemonset.yaml
@@ -15,7 +15,7 @@ spec:
       labels:
         app: node.csi.trident.netapp.io
     spec:
-      serviceAccount: trident-csi
+      serviceAccount: trident-node-csi
       hostNetwork: true
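
In the Kubernetes pod spec, `serviceAccount` is a deprecated alias for `serviceAccountName`. A hedged sketch of the same daemonset change using the current field name could look like this; only the `trident-node-csi` name comes from the diff above, and the rest of the manifest is omitted.

```yaml
# Fragment of the node daemonset manifest; all other fields are omitted.
spec:
  template:
    spec:
      # Same effect as "serviceAccount: trident-node-csi" in the diff above,
      # using the non-deprecated field name.
      serviceAccountName: trident-node-csi
      hostNetwork: true
```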

gnarl commented 2 years ago

Hi @msau42,

Thanks for opening this issue. With the Trident 22.07 release we completed an investigation of the permissions needed by Trident. This primarily focused on the minimal set of permissions needed by the Trident daemonset Pods. This set of permissions is largely influenced by Kubernetes, the CSI Spec, and Linux.

We are planning on separating out the controller and daemonset service accounts for the Trident 22.10 release as you have suggested.

The team has also been discussing how to reduce RBAC permissions in both the Operator and when using tridentctl to install Trident. There are more RBAC permissions that can potentially be reduced by removing CRD creation from the Operator's install functionality. Many of the RBAC permissions that are currently needed in the Operator are only required at install/upgrade time. We do have some customers that set the Operator's replicas value to 0 after completing an upgrade as a mitigation strategy.
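
As an illustration of the replicas mitigation mentioned above, a patch fragment along these lines could be applied to the Operator's Deployment after an upgrade completes. This is a sketch only: the file name `scale-down.yaml`, the Deployment name `trident-operator`, and the namespace `trident` are assumptions and may differ in a given installation.

```yaml
# scale-down.yaml (hypothetical file name)
# Could be applied with, for example:
#   kubectl patch deployment trident-operator -n trident --patch-file scale-down.yaml
# Deployment name and namespace are assumptions.
spec:
  replicas: 0   # no Operator pods run, so its install-time permissions are not exercised
```

Setting `replicas` back to 1 would be needed before the next install or upgrade.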

Thanks for sharing the diff of RBAC changes that you've found work for your Trident deployments. It is a good example of another way to separate RBAC permissions depending on the customer's need to use the Operator or not.

msau42 commented 2 years ago

Glad to hear, thanks!

gnarl commented 1 year ago

The suggested RBAC improvements were completed with the Trident v23.01 release. Please let us know if these changes don't meet your expectations.