Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Does AKS support pod security admission at cluster level? #3684

Open smartaquarius10 opened 1 year ago

smartaquarius10 commented 1 year ago

Hello,

I have 1 important question and 2 follow-up questions regarding PSA.

  1. Is it possible to create pod security admission at cluster level in Azure Kubernetes as described here? If yes, then:

  2. How do we add usernames to the exemptions section in the case of RBAC-based AKS?

  3. Should we add the AD group object ID in the array?

Please suggest. Thank you.

Note:- I am already aware that we can add them at namespace level. This question is strictly about cluster-level PSA.

Kind Regards, Tanul

ghost commented 1 year ago

Hi smartaquarius10, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

ghost commented 1 year ago

Triage required from @Azure/aks-pm

ghost commented 1 year ago

Action required from @CocoWang-wql, @miwithro, @charleswool.

miwithro commented 1 year ago

@smartaquarius10 we don't have a way to use exceptions. Can you expand on the use case you are looking for?

smartaquarius10 commented 1 year ago

@miwithro That is not a problem. But does AKS support cluster-level PSA? I tried creating it but got an error that the CRD for kind AdmissionConfiguration is unavailable.

smartaquarius10 commented 1 year ago

How do we create this pod security admission at cluster level in AKS?

bravebeaver commented 1 year ago

according to the docs, you might need to have the azure-policy add-on for AKS

smartaquarius10 commented 1 year ago

@bravebeaver Nope, still not working. I have enabled the Azure policy, but this error is still coming:

error: resource mapping not found for name: "" namespace: "" from "psa.yaml": no matches for kind "AdmissionConfiguration" in version "apiserver.config.k8s.io/v1"
ensure CRDs are installed first

This is the psa.yaml:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: DefaultPodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: v1.25
      audit: "restricted"
      audit-version: v1.25
      warn: "restricted"
      warn-version: v1.25
    exemptions:
      usernames: []
      runtimeClassNames: []
      namespaces: [kube-system, gatekeeper-system, azure-arc, azure-extensions-usage-system]

Azure policy is enabled (screenshot omitted)

Constraint templates (screenshot omitted)

smartaquarius10 commented 1 year ago

@miwithro When you are free, could you please confirm whether it is possible to configure cluster-level pod security admission in AKS?

tspearconquest commented 1 year ago

Hi @smartaquarius10 fellow user here interested in the feature.

I think it is currently unsupported in AKS.

In the link you shared in your OP (https://kubernetes.io/docs/tutorials/security/cluster-level-pss/#set-modes-versions-and-standards), step 4 clearly indicates that this configuration is intended to be consumed by the API server during cluster creation.

The instructions say to write the AdmissionConfiguration to a file under /tmp on a node before you create the cluster, and then point the API server at that file with the manifest in step 4. This is all done on the node before the cluster is created.

AKS nodes come pre-configured with Kubernetes already running, therefore we have no opportunity to add this configuration to the cluster.
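For reference, on a self-managed control plane the tutorial wires this file into the API server with a startup flag rather than `kubectl apply`; a sketch (the file path here is an assumption):

```shell
# Not possible on AKS, where the API server is managed by Azure:
# on a self-managed control plane, kube-apiserver reads the
# AdmissionConfiguration from a flag at startup.
kube-apiserver \
  --admission-control-config-file=/etc/kubernetes/psa/admission-config.yaml
```

This is also why `kubectl apply -f psa.yaml` fails with "no matches for kind": AdmissionConfiguration is a static config-file format read by the API server, not an API resource served by the cluster.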

smartaquarius10 commented 1 year ago

@tspearconquest, Got it thanks.

tspearconquest commented 1 year ago

It could be a good feature request. I would like to see this reopened. Azure could provide a way for us to configure this via az cli or terraform :)

smartaquarius10 commented 1 year ago

@tspearconquest reopened

bravebeaver commented 1 year ago

hi @smartaquarius10,

apologies, as i took a short break. if i read your question correctly, you might want to try using-pod-security-admission? so basically, instead of a yaml file you can do

kubectl label --overwrite ns --all pod-security.kubernetes.io/enforce=restricted

it looks to me that this covers all namespaces, and needs a bit of tweaking to exclude the 4 namespaces 🤔. assuming you won't want to create pods in those 4 namespaces anyway?
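a minimal sketch of that tweak (the exclusion list is an assumption, taken from the namespaces exempted in the psa.yaml earlier in this thread):

```shell
# Label every namespace as restricted, skipping the AKS system namespaces.
EXCLUDE="kube-system gatekeeper-system azure-arc azure-extensions-usage-system"
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  case " $EXCLUDE " in
    *" $ns "*) echo "skipping $ns" ;;  # system namespace, leave unlabeled
    *) kubectl label --overwrite ns "$ns" pod-security.kubernetes.io/enforce=restricted ;;
  esac
done
```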

i had a feeling that you might be in the process of migrating from pod security policy and the built-in policies might be less helpful.

disclaimer: i am commenting from a personal capacity only. 🙏

smartaquarius10 commented 1 year ago

@bravebeaver, Thank you for sharing, but I think you misunderstood my question. We know that applying it at the namespace level is already possible. My question is strictly about the cluster level, the one I have mentioned here.

smartaquarius10 commented 1 year ago

@bravebeaver I have updated my question; it should be easier for everyone now.

The most important question is enabling it at cluster level.

smartaquarius10 commented 1 year ago

I'll try this approach. Let's see if it works with AKS.

However, I don't know whether it is a best practice or not.

bravebeaver commented 1 year ago

> @bravebeaver I have updated my question.. would be easy for everyone now..
>
> The most important question is enabling at cluster level.

just wondering, would all namespaces make the cut for "cluster" level?

bravebeaver commented 1 year ago

> I'll try this approach. Let see if it works with AKS.
>
> However, I don't know it is a best practice or not

reading the official documentation, i think the configMap approach should work, as AKS supports validating webhooks.

one caveat is indeed the version of the cluster: PodSecurityPolicy is deprecated in newer versions.

smartaquarius10 commented 1 year ago

As far as I know, PSPs are deprecated but PSA is not.

smartaquarius10 commented 1 year ago

> I'll try this approach. Let see if it works with AKS. However, I don't know it is a best practice or not
>
> reading the official documentation, i think the configMap approach should work as AKS supports validating webhooks.
>
> one caveat is indeed the version of the cluster, PodSecurityPolicy is deprecated for newer versions.

But do we need to deploy the pod security admission webhook for that, as mentioned in the Stack Overflow link?

smartaquarius10 commented 1 year ago

> I'll try this approach. Let see if it works with AKS. However, I don't know it is a best practice or not
>
> reading the official documentation, i think the configMap approach should work as AKS supports validating webhooks.
>
> one caveat is indeed the version of the cluster, PodSecurityPolicy is deprecated for newer versions.

Is there any way to use this configmap without any 3rd-party apps like these webhooks? Something already available in AKS that pulls that configmap and maps it to the api-server.

bravebeaver commented 1 year ago

agreed. i have not found any official documentation just yet; let me get back to you (soonish).

smartaquarius10 commented 1 year ago

@bravebeaver sure thanks🙂

OmarHawk commented 1 year ago

Also interested in this. Applying the very same baseline to every namespace seems quite odd... and depending on how you use the cluster, it is also kind of risky to "forget" a new namespace...

tspearconquest commented 1 year ago

The problem with setting cluster-level restrictions is that they would even apply to kube-system. In AKS, Azure deploys the pods in the kube-system namespace and we don't have control over them. So if we, as the users of the cluster, provision a cluster-wide policy which is too restrictive, it could break the cluster, because we don't own the pods in kube-system and don't have access to the master nodes.

OmarHawk commented 1 year ago

That's why there are exemptions (see https://kubernetes.io/docs/concepts/security/pod-security-admission/#exemptions), and these wouldn't change super often. But given the way we use our k8s cluster, new (actual workload) namespaces get created quite often...

and someone just "forgetting" the labels means the pods in that namespace are not on the same baseline as the others and run the risk of being misconfigured... Having one global policy defined at cluster level with a certain set of exemptions (like kube-system) seems much less error-prone and much less work...

tspearconquest commented 1 year ago

Oh I see, I missed the piece about using the PSA webhook. Apologies!

In the docs for that webhook, I can see some issues that may prevent its use in AKS:

Source: https://github.com/kubernetes/pod-security-admission/blob/master/webhook/README.md

> The webhook is suitable for environments where the built-in PodSecurity admission controller cannot be used.

Most importantly:

> It is built with the same Go package as the Pod Security Admission Controller.

In AKS, the PSA controller is enabled by default and there is no way to disable it. Because they use the same go package, the library used for the webhook is actually the same code as what is built-in. The difference is that the built-in code doesn't run as a webhook because it's built-in to the API server itself, whereas the webhook runs as an in-cluster ValidatingWebhookConfiguration resource.

The purpose of the webhook is to provide support for PSA in older clusters (pre-1.21) where the PSA controller did not exist, and so could not be enabled.

Sadly, because they use the same underlying code, I have a strong suspicion that deploying the webhook to a cluster with the PSA controller enabled will cause a conflict which breaks the cluster: the cluster-wide PSA configuration you deploy would be picked up by both the PSA controller and the PSA webhook, and they would both try to validate incoming workloads at the same time, each with a different configuration. Unfortunately, I don't have the option to test it myself, as my clusters are all actively in use by other team members, but I definitely recommend treading with caution and testing it thoroughly on a non-essential cluster.

OmarHawk commented 1 year ago

Actually, we'd rather rely on something that has not the potential to break a cluster and that is approved / supported by Microsoft itself :P

tspearconquest commented 1 year ago

For sure, I feel the same. Microsoft should offer a way to configure the defaults and exemptions blocks of the kind: PodSecurityConfiguration resource (inside the kind: AdmissionConfiguration plugin list) via, at minimum, the Azure CLI and the Azure API (for Terraform, Pulumi, etc.) in order to accomplish this.

papanito commented 1 year ago

> Also interested in this. Applying to every namespace the very same baseline seems quite odd... and depending how you use the cluster also kind of risky to "forget" a new namespace...

I also would prefer a cluster-wide config, but as a workaround you may use something like OPA Gatekeeper:

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: AssignMetadata
metadata:
  name: ns-assign-label-podsecurity
spec:
  match:
    scope: Cluster
    kinds:
      - apiGroups: ["*"]
        kinds: ["Namespace"]
  location: 'metadata.labels."pod-security.kubernetes.io/enforce"'
  parameters:
    assign:
      value: restricted
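
One possible refinement of the above, sketched under the assumption that Gatekeeper's mutation `match` supports `excludedNamespaces` (for Namespace objects it matches on the object's name), so the system namespaces never receive the enforce label:

```yaml
apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: AssignMetadata
metadata:
  name: ns-assign-label-podsecurity
spec:
  match:
    scope: Cluster
    kinds:
      - apiGroups: ["*"]
        kinds: ["Namespace"]
    # Assumption: keep the AKS system namespaces out of the mutation.
    excludedNamespaces:
      - kube-system
      - gatekeeper-system
      - azure-arc
      - azure-extensions-usage-system
  location: 'metadata.labels."pod-security.kubernetes.io/enforce"'
  parameters:
    assign:
      value: restricted
```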

CocoWang-wql commented 1 year ago

Hello all, thanks for your feedback. We will consider this feature request.

smartaquarius10 commented 1 year ago

@CocoWang-wql Thanks for helping. Let us know once the feature is released. Currently we have added this within the namespace definition using automated deployments, but having it at cluster level would be really helpful.

Just one more piece of feedback: it would be great if it could support RBAC as well, because after enabling cluster-wide restrictions, debugging nodes using this command might not be possible.

kubectl debug node/aks-nodepool1-37663765-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0

If RBAC is integrated, then we can add an Azure AD group to the exemption list, which would let admins keep debugging nodes. Thank you.

Kind Regards, Tanul