rancher / rke2

https://docs.rke2.io/
Apache License 2.0
1.49k stars 260 forks source link

`rke2-cloud-controller-manager` RB/CRBs subject lists grow without bound #6272

Closed yoderme closed 3 weeks ago

yoderme commented 1 month ago

Environmental Info: RKE2 Version:

rke2 version v1.28.6+rke2r1 (572f367fd08e0e650a278477fdeea551dba48af7)

Node(s) CPU architecture, OS, and Version:

Linux (redacted) 5.15.0-105-generic #115-Ubuntu SMP Mon Apr 15 09:52:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

24 nodes.

Describe the bug:

In the kube-system namespace the rke2-cloud-controller-manager-authentication-reader RoleBinding appears to grow without bound. We have an older cluster where there are nearly 20,000 subjects, all alike.

❯ kubectl -n kube-system get rolebinding rke2-cloud-controller-manager-authentication-reader -o yaml | head -n 50
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rke2-cloud-controller-manager-authentication-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: rke2-cloud-controller-manager
  namespace: kube-system
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: rke2-cloud-controller-manager
  namespace: kube-system
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: rke2-cloud-controller-manager
  namespace: kube-system
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: rke2-cloud-controller-manager
  namespace: kube-system
...

This just keeps going and going with the same subject over and over again.

We have other clusters that are younger; they have a smaller number of the subjects in that RoleBinding but there are still an unhealthy number.

The role it's referring to is

❯ kubectl --context c048 -n kube-system get role extension-apiserver-authentication-reader -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: extension-apiserver-authentication-reader
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resourceNames:
  - extension-apiserver-authentication
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch

We think a a dup subject is added every time rke2-server is restarted.

Steps To Reproduce:

Just wait. Restart rke2-server if you like.

Expected behavior:

Just one subject. ;-)

Actual behavior:

As above.

brandond commented 1 month ago

That's odd... we're using the core Kubernetes RBAC policy helpers to generate the desired config, I'm not sure why it's patching the same subject into the group over and over instead of just setting it to the desired list. I wonder if this is a bug in upstream's helper code, or if we're just calling it wrong...

mechpen commented 1 month ago

hi @brandond, this could be a potential fix. The k8s.io library does not automatically add the APIGroup field when reconciling.

index b0235362..5268789a 100644
--- a/pkg/rke2/clusterrole_bootstrap.go
+++ b/pkg/rke2/clusterrole_bootstrap.go
@@ -102,7 +102,7 @@ func roleBindings() map[string][]rbacv1.RoleBinding {
 // For some reason the core helpers don't have any methods for adding namespaced users, only namespaced service accounts.
 func RoleBindingNamespacedUsers(r *rbacv1helpers.RoleBindingBuilder, namespace string, users ...string) *rbacv1helpers.RoleBindingBuilder {
        for _, user := range users {
-               r.RoleBinding.Subjects = append(r.RoleBinding.Subjects, rbacv1.Subject{Kind: rbacv1.UserKind, Namespace: namespace, Name: user})
+               r.RoleBinding.Subjects = append(r.RoleBinding.Subjects, rbacv1.Subject{APIGroup: rbacv1.GroupName, Kind: rbacv1.UserKind, Namespace: namespace, Name: user})
        }
        return r
 }
@@ -119,7 +119,7 @@ func RoleBindingName(r *rbacv1helpers.RoleBindingBuilder, name string) *rbacv1he
 // For some reason the core helpers don't have any methods for adding namespaced users, only namespaced service accounts.
 func ClusterRoleBindingNamespacedUsers(r *rbacv1helpers.ClusterRoleBindingBuilder, namespace string, users ...string) *rbacv1helpers.ClusterRoleBindingBuilder {
        for _, user := range users {
-               r.ClusterRoleBinding.Subjects = append(r.ClusterRoleBinding.Subjects, rbacv1.Subject{Kind: rbacv1.UserKind, Namespace: namespace, Name: user})
+               r.ClusterRoleBinding.Subjects = append(r.ClusterRoleBinding.Subjects, rbacv1.Subject{APIGroup: rbacv1.GroupName, Kind: rbacv1.UserKind, Namespace: namespace, Name: user})
        }
        return r
 }
brandond commented 1 month ago

I think you're probably right, since the comparison used by the helper just does a literal object compare which would fail if the apigroup and kind don't match... https://github.com/kubernetes/kubernetes/blob/79fee524e65ddc7c1448d5d2554c6f91233cf98d/staging/src/k8s.io/component-helpers/auth/rbac/reconciliation/reconcile_rolebindings.go#L219-L226

This same problem will also prevent the helper from cleaning the duplicates once we fix this, since technically it doesn't reset the list, it just removes anything that's not wanted, and adds stuff that is. It doesn't care if something that is wanted is in there more than once.

brandond commented 1 month ago

Ugh this also affects the ClusterRoleBindings rke2-cloud-controller-manager and rke2-cloud-controller-manager-auth-delegator too.

We'll have to figure out if this is something that we need to actively clean up, rather than just providing example commands to remediate.