rancher / cluster-api-provider-rke2

RKE2 bootstrap and control-plane Cluster API providers.
Apache License 2.0

[SURE-9154] cattle-cluster-agent does not start after apply cisProfile: cis #457

Open alexander-demicev opened 3 days ago

alexander-demicev commented 3 days ago

Issue description: After applying cisProfile: cis to a deployed or new RKE2 CAPV cluster, cattle-cluster-agent does not start.

Business impact: Can't apply cisProfile.

Troubleshooting steps: Event:

Type     Reason        Age                  From                    Message
Warning  FailedCreate  10m (x43 over 124m)  statefulset-controller  create Pod fleet-agent-0 in StatefulSet fleet-agent failed error: pods "fleet-agent-0" is forbidden: violates PodSecurity "restricted:latest": seccompProfile (pod or containers "fleet-agent-register", "fleet-agent", "fleet-agent-clusterstatus" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Related:

https://github.com/rancher/rancher/issues/47255

If we try to import the CAPV RKE2 cluster manually, applying the import manifest produces the following warnings:

Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead

Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "cluster-register" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "cluster-register" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "cluster-register" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "cluster-register" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
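The four conditions named in this warning can be sketched as a small checker. This is only an illustration of what the message is asking for, not the real Pod Security Admission controller: the function name `restricted_violations` and the sample pod specs are hypothetical, and the sketch covers only the four checks quoted in the warning, not the full upstream "restricted" profile.

```python
from typing import Any

# seccompProfile types accepted according to the warning text
ALLOWED_SECCOMP = {"RuntimeDefault", "Localhost"}


def restricted_violations(pod_spec: dict[str, Any]) -> list[str]:
    """Return violations of the four checks quoted in the warning (simplified)."""
    violations: list[str] = []
    pod_ctx = pod_spec.get("securityContext") or {}
    pod_seccomp = (pod_ctx.get("seccompProfile") or {}).get("type")
    pod_non_root = pod_ctx.get("runAsNonRoot")

    containers = (pod_spec.get("initContainers") or []) + (pod_spec.get("containers") or [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}

        # Container-only setting: must be explicitly false.
        if ctx.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation != false")

        # Container-only setting: capabilities.drop must include "ALL".
        dropped = (ctx.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            violations.append(f"{name}: unrestricted capabilities")

        # "pod or container": a container-level value overrides the pod-level one.
        if ctx.get("runAsNonRoot", pod_non_root) is not True:
            violations.append(f"{name}: runAsNonRoot != true")

        seccomp = (ctx.get("seccompProfile") or {}).get("type", pod_seccomp)
        if seccomp not in ALLOWED_SECCOMP:
            violations.append(f"{name}: seccompProfile")

    return violations


# Hypothetical pod spec with no securityContext at all.
bare_pod = {"containers": [{"name": "cluster-register"}]}

# Hypothetical pod spec with the four fields from the warning set.
hardened_pod = {
    "securityContext": {"runAsNonRoot": True, "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [
        {
            "name": "cluster-register",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "capabilities": {"drop": ["ALL"]},
            },
        }
    ],
}

for violation in restricted_violations(bare_pod):
    print(violation)
```

A spec like `bare_pod` trips all four checks, while `hardened_pod` passes them. This is the kind of change the agent manifests would need in order to run in a namespace where "restricted" is enforced.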

Repro steps:

From the error message in the cattle-cluster-agent pod, the Pod Security Admission (PSA) profile applied in this cluster is 'restricted':

fleet-agent failed error: pods "fleet-agent-0" is forbidden: violates PodSecurity "restricted:latest": seccompProfile (pod or containers "fleet-agent-register", "fleet-agent", "fleet-agent-clusterstatus" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

The 'restricted' PSA profile comes from upstream Kubernetes and is applied by default since 1.25 as a replacement for PodSecurityPolicies. You can read more in the RKE2 PSS section [1].

I've reproduced your issue by:

1. Create a custom RKE2 1.30 cluster with the CIS profile enabled.
2. Import this custom cluster into Rancher.

root@rke2-custom-cis-aaller-02:~# curl --insecure -sfL https://xxxx/v3/import/xxx.yaml | kubectl apply -f -
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
namespace/cattle-system created
serviceaccount/cattle created
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
secret/cattle-credentials-c243511 created
clusterrole.rbac.authorization.k8s.io/cattle-admin created
Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "cluster-register" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "cluster-register" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "cluster-register" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "cluster-register" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/cattle-cluster-agent created
service/cattle-cluster-agent created
root@rke2-custom-cis-aaller-02:~#

There are no pods in the cattle-system namespace:

root@rke2-custom-cis-aaller-02:~# kubectl get pods -n cattle-system
No resources found in cattle-system namespace.
root@rke2-custom-cis-aaller-02:~#

To work around this:

a) Create a new admission config file: /etc/rancher/rke2/rke2-pss-custom.yaml.

Create your custom config for PodSecurity admission, adding the required namespaces to exemptions.namespaces.
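To illustrate what the exemption list does, here is a minimal Python sketch that builds the same kind of AdmissionConfiguration as a dictionary and checks whether a namespace is exempt from the default 'restricted' level. The helper names `build_pss_config` and `is_exempt` are hypothetical, and the namespace list is the one used in this workaround. The sketch prints JSON, which is also valid YAML, but loading that output in RKE2 has not been tested here, so treat it as an aid to understanding rather than a deployment tool.

```python
import json
from typing import Any


def build_pss_config(exempt_namespaces: list[str], level: str = "restricted") -> dict[str, Any]:
    """Build an AdmissionConfiguration dict with the given namespace exemptions."""
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "AdmissionConfiguration",
        "plugins": [
            {
                "name": "PodSecurity",
                "configuration": {
                    "apiVersion": "pod-security.admission.config.k8s.io/v1beta1",
                    "kind": "PodSecurityConfiguration",
                    "defaults": {
                        "enforce": level,
                        "enforce-version": "latest",
                        "audit": level,
                        "audit-version": "latest",
                        "warn": level,
                        "warn-version": "latest",
                    },
                    "exemptions": {
                        "usernames": [],
                        "runtimeClasses": [],
                        # De-duplicate and sort for a stable output.
                        "namespaces": sorted(set(exempt_namespaces)),
                    },
                },
            }
        ],
    }


def is_exempt(config: dict[str, Any], namespace: str) -> bool:
    """Return True if pods in `namespace` skip the default PodSecurity checks."""
    for plugin in config["plugins"]:
        if plugin["name"] == "PodSecurity":
            return namespace in plugin["configuration"]["exemptions"]["namespaces"]
    return False


# Namespaces taken from the workaround in this issue.
config = build_pss_config(
    ["cattle-system", "cattle-fleet-system", "kube-system", "cis-operator-system", "tigera-operator"]
)

print(json.dumps(config, indent=2))
print("cattle-system exempt:", is_exempt(config, "cattle-system"))
print("default exempt:", is_exempt(config, "default"))
```

With this configuration, pods in cattle-system and cattle-fleet-system are no longer checked against 'restricted', while every namespace not listed (for example default) still is.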

You can use [2] as a guide to the namespaces Rancher considers required. In the following example I've only added cattle-system and cattle-fleet-system as additional namespaces.

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [cattle-system, cattle-fleet-system, kube-system, cis-operator-system, tigera-operator]

b) Restart RKE2:

[1] - PodSecurityStandards in RKE2
[2] - Rancher privileged PSA profile

Workaround: Is a workaround available and implemented? Not yet.

alexander-demicev commented 3 days ago

More info on this topic: https://docs.rke2.io/security/pod_security_standards