younaman opened 1 year ago
Thank you for the report. We are aware of the potential impact of a compromised piraeus-operator-controller-manager deployment. However, we see limited possibility to change that:
The Piraeus Operator manages all resources related to Piraeus Datastore, which, at its core, is a storage provider for Kubernetes using the CSI mechanism. One of the fundamental operations a storage provider must perform is the "mount()" system call, which already requires local root privileges on a node. In addition, we need to manage DRBD resources, which also requires the SYS_ADMIN capability. This means we necessarily need to run highly privileged workloads.
On to the reported issues:
cluster-wide reading of secrets is necessary to enable users to use secrets in the storage class [1]. We use those secrets so users can securely store information about their backup locations.
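To make the mechanism concrete, here is a minimal sketch (not Piraeus code) of how a CSI provisioner resolves a secret referenced from StorageClass parameters. The parameter keys are the standard ones understood by the CSI external-provisioner; the secret store is a plain dict standing in for the Kubernetes API:

```python
# Illustrative sketch: resolving a secret referenced by a StorageClass.
# Because the secret may live in ANY namespace the user chooses, the
# provisioner's service account needs cluster-wide secret read access.

SECRET_NAME_KEY = "csi.storage.k8s.io/provisioner-secret-name"
SECRET_NS_KEY = "csi.storage.k8s.io/provisioner-secret-namespace"

def resolve_provisioner_secret(storage_class_params, secret_store):
    """Look up the secret referenced by a StorageClass, if any.

    secret_store maps (namespace, name) -> dict of secret data.
    Returns None when the StorageClass references no secret.
    """
    name = storage_class_params.get(SECRET_NAME_KEY)
    if name is None:
        return None
    namespace = storage_class_params.get(SECRET_NS_KEY, "default")
    key = (namespace, name)
    if key not in secret_store:
        raise KeyError(f"secret {namespace}/{name} not found")
    return secret_store[key]

# Example: a StorageClass pointing at backup credentials in another namespace.
params = {
    SECRET_NAME_KEY: "backup-credentials",
    SECRET_NS_KEY: "team-a",
}
store = {("team-a", "backup-credentials"): {"access-key": "example"}}
print(resolve_provisioner_secret(params, store))
```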
The piraeus-operator deployment itself does not use these permissions, but it creates the service account and RBAC resources necessary for the storage provider to run. One way Kubernetes protects against privilege escalation is that a ServiceAccount can only create ClusterRole resources granting permissions it already holds itself. So in order for the piraeus-operator to create the rules for the csi-provisioner, it needs to have the cluster-wide secret read permission as well.
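The escalation-prevention rule can be sketched as follows (a simplified illustration, not the actual Kubernetes implementation): a ClusterRole may only be created if every rule in it is covered by the creator's own permissions.

```python
# Simplified sketch of Kubernetes' RBAC escalation prevention: a subject may
# only create a (Cluster)Role whose rules are all covered by permissions the
# subject already holds. Real RBAC also handles resourceNames, non-resource
# URLs, and API groups; this toy version handles only verbs and resources.

def covers(own_rule, wanted_rule):
    """True if own_rule grants everything wanted_rule asks for."""
    def superset(own, wanted):
        return "*" in own or set(wanted) <= set(own)
    return (superset(own_rule["verbs"], wanted_rule["verbs"])
            and superset(own_rule["resources"], wanted_rule["resources"]))

def may_create_role(own_rules, wanted_rules):
    """Every wanted rule must be covered by at least one of our own rules."""
    return all(any(covers(o, w) for o in own_rules) for w in wanted_rules)

# The operator wants to create the csi-provisioner's role, which needs
# cluster-wide secret reads -- so the operator itself must hold them too.
operator_rules = [
    {"verbs": ["get", "list", "watch"], "resources": ["secrets"]},
    {"verbs": ["create", "patch", "update"], "resources": ["clusterroles"]},
]
provisioner_rules = [
    {"verbs": ["get", "list"], "resources": ["secrets"]},
]
print(may_create_role(operator_rules, provisioner_rules))      # True
# Without the secret permission, creation would be denied:
print(may_create_role(operator_rules[1:], provisioner_rules))  # False
```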
Thus, I do not see any way to mitigate this potential issue. All of the provided mitigations would impact the functionality in some way. If you do have further suggestions, we'd be interested in hearing them.
The harder part is mitigating most attack scenarios. To check for problematic RBAC permissions: https://github.com/PaloAltoNetworks/rbac-police
./rbac-police_v1.1.2_linux_amd64 eval lib/ -n piraeus-datastor
...
"summary": {
    "failed": 8,
    "passed": 15,
    "errors": 0,
    "evaluated": 23
}
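A minimal sketch of the kind of policy such a scanner evaluates (illustrative only, not rbac-police's actual policy code): flag any role that can read secrets cluster-wide or create/update RBAC objects, since that combination allows stealing credentials and escalating privileges.

```python
# Toy version of the kind of check rbac-police performs against a role's
# rule list. The risky capability names below are illustrative.

RISKY = [
    ("read secrets", {"get", "list", "watch"}, "secrets"),
    ("escalate via ClusterRoles", {"create", "patch", "update"},
     "clusterroles.rbac.authorization.k8s.io"),
    ("escalate via ClusterRoleBindings", {"create", "patch", "update"},
     "clusterrolebindings.rbac.authorization.k8s.io"),
]

def audit(rules):
    """Return the names of risky capabilities granted by a rule list."""
    findings = []
    for name, verbs, resource in RISKY:
        for rule in rules:
            if resource in rule["resources"] and verbs & set(rule["verbs"]):
                findings.append(name)
                break
    return findings

# Approximation of the permissions reported for piraeus-op below:
piraeus_op_rules = [
    {"verbs": ["get", "list", "watch"], "resources": ["secrets"]},
    {"verbs": ["create", "patch", "update"],
     "resources": ["clusterroles.rbac.authorization.k8s.io",
                   "clusterrolebindings.rbac.authorization.k8s.io"]},
]
for finding in audit(piraeus_op_rules):
    print("FLAG:", finding)
```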
To mitigate all of them, you would have to delegate the risky RBAC permissions to a protected workload on a protected node, which would verify requests and perform those risky actions on behalf of the storage workloads running on the worker nodes.
To be fair: Piraeus is not alone with this problem.
I am Nanzi Yang, and I found a potential risk in Piraeus that can be leveraged for cluster-level privilege escalation.
Detailed analysis: Piraeus has a deployment called piraeus-op-controller-manager, which runs two pods scheduled onto arbitrary worker nodes. The pods' service account is piraeus-op, which is bound to the piraeus-op-controller-manager cluster role via a ClusterRoleBinding. That cluster role has the get/list/watch verbs on secret resources, and the create/patch/update verbs on both clusterrolebindings.rbac.authorization.k8s.io and clusterroles.rbac.authorization.k8s.io. Thus, if a malicious user can access a worker node running piraeus-op-controller-manager:
Mitigation Discussion:
A few questions: