kubearmor / KubeArmor

Runtime Security Enforcement System. Workload hardening/sandboxing and implementing least-permissive policies made easy leveraging LSMs (BPF-LSM, AppArmor).
https://kubearmor.io/
Apache License 2.0

Kubearmor fetches wrong pod name/namespace to apply whitelist policy on long running GKE clusters (BPF-LSM enforcement) #1780

Open gusfcarvalho opened 5 months ago

gusfcarvalho commented 5 months ago

Bug Report

General Information

Issue

This is a consistent issue when running kubearmor on any long-lived cluster. We have a set of policies in protected-namespace where we whitelist only a few pods based on labels:

Config

apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: default-deny
  namespace: protected-namespace
spec:
  action: Block
  process:
    matchDirectories:
    - dir: /
      recursive: true
  selector:
    matchLabels: {}
---
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: allow-app
  namespace: protected-namespace
spec:
  action: Allow
  file:
    matchDirectories:
    - dir: /
      recursive: true
  process:
    matchDirectories:
    ## list of matchDirectories here
    matchPaths:
    ## list of allowed paths here
  selector:
    matchLabels:
      kubearmor-whitelist-profile: app

Symptom

A pod from unprotected-namespace cannot run because it is unable to execute correct-binary: permission denied

Extra information

With this configuration, after KubeArmor has been running on the cluster for some time, it starts to deny applications in unprotected namespaces as well, even though there are no KubeArmorPolicy objects in those namespaces:

From karmor logs, I could see:

{
  "Timestamp": 1718029258,
  "UpdatedTime": "2024-06-10T14:20:58.078820Z",
  "ClusterName": "default",
  "HostName": "***-bz88",
  "NamespaceName": "protected",
  "Owner": { "Ref": "Pod", "Name": "wrong-pod-name", "Namespace": "protected" },
  "PodName": "wrong-pod-name",
  "Labels": "wrong-pod-labels",
  "ContainerID": "wrong-id",
  "ContainerName": "wrong-container",
  "ContainerImage": "wrong-image",
  "ProcessName": "/binary-from-unprotected-pod",
  "PolicyName": "default-deny",
  "Severity": "1",
  "Type": "MatchedPolicy",
  "Operation": "Process",
  "Resource": "/binary-from-unprotected-pod",
  "Data": "lsm=SECURITY_BPRM_CHECK",
  "Enforcer": "BPFLSM",
  "Action": "Block",
  "Result": "Permission denied",
  "Cwd": "/"
}

This recurs on every cluster where we have the kubearmor operator running. As a workaround, if we run kubectl delete pods --all -n <kubearmor-namespace>, the system goes back to working as expected (until the issue returns a few days/weeks later).

Versions

KubeArmor: v1.3.4
Cluster: GKE 1.27.13-gke.1000000

Expected behavior

I would expect KubeArmor to consistently pick up the correct pod names and namespaces :).

daemon1024 commented 5 months ago

Hey @gusfcarvalho

I want to try reproducing this issue. Can you tell me about the scale of the cluster, such as the number of nodes and the number of pods per node, so that we can replicate this scenario? It is not reproducible in our normal test clusters.

If at all possible, can you redact sensitive information and share the logs from the kubearmor pods?

Thanks!

gusfcarvalho commented 5 months ago

We see this issue on a cluster with 8-9 nodes running about 60 pods across 10 namespaces. We also see it on bigger clusters.

The main pain point is that it works fine on a 'fresh' pod; it takes a few days or so for the issue to kick in (so I would not expect it to show up in any e2e tests).

gusfcarvalho commented 5 months ago

If at all possible, can you redact sensitive information and share the logs from the kubearmor pods?

Sure! This issue recurs so often that it's actually easy to fetch them. Do you need logs from kubearmor-bpf-containerd only?

From what I can see, the pod logs only contain repeated entries like:

2024-06-11 05:24:33.556758      INFO    Detected a Pod (deleted/<>)
2024-06-11 05:24:42.843611      INFO    Successfully deleted visibility map with key={PidNS:40265*** MntNS:40265***} from the kernel
2024-06-11 05:24:42.850623      INFO    Detected a container (<>)
2024-06-11 05:25:46.128971      INFO    Updating container rules for aeb***
2024-06-11 05:25:47.422789      INFO    Detected a Pod (modified/<>)

The only place where I can see wrong information is the karmor output. My guess is that the keys are somehow getting mismatched.
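
To illustrate what I mean by mismatched keys, here is a tiny standalone Go sketch (purely hypothetical; these are not KubeArmor's actual data structures, and the IDs are made up): if container identity were cached under a {PidNS, MntNS}-style key and a stale entry survived until the kernel reused those namespace IDs, events from a new pod would be attributed to the old identity.

package main

import "fmt"

// NsKey mirrors the {PidNS, MntNS} pair seen in the "deleted visibility map" log line.
type NsKey struct {
    PidNS uint32
    MntNS uint32
}

// ContainerMeta is the identity that ends up in alerts (pod name, namespace, ...).
type ContainerMeta struct {
    PodName   string
    Namespace string
}

func main() {
    containers := map[NsKey]ContainerMeta{}

    // A container from the protected namespace is registered (IDs are invented here).
    key := NsKey{PidNS: 4026531111, MntNS: 4026532222}
    containers[key] = ContainerMeta{PodName: "wrong-pod-name", Namespace: "protected"}

    // If this entry is never cleaned up and the same namespace IDs are later
    // handed to a pod in an unprotected namespace, lookups attribute the new
    // pod's events to the stale identity.
    if meta, ok := containers[NsKey{PidNS: 4026531111, MntNS: 4026532222}]; ok {
        fmt.Printf("event attributed to %s/%s\n", meta.Namespace, meta.PodName)
    }
}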

gusfcarvalho commented 5 months ago

This issue still persists. Any updates? Feel free to DM me on the Kubernetes Slack at @gusfcarvalho.

gusfcarvalho commented 4 months ago

Hello! 😄 any updates on this?

carlosrodfern commented 3 months ago

I'm having a similar issue. I have a policy with a selector matching specific labels within a specific namespace. Two days later, I began to see relay alert logs pointing to policy violations in a different namespace, and on containers with no labels matching the selector. I'm running kubearmor v1.4.0 on 17 nodes at this moment, on AWS EKS.

carlosrodfern commented 3 months ago

I deleted all the pods in the KubeArmor namespace, and some hours later it was already misapplying policies.

A very simple policy:

apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: generic-maint-tools-access
  namespace: myapplication
spec:
  action: Audit
  message: Restricted maintenance tool access attempt detected
  process:
    matchDirectories:
      - dir: /sbin/
        recursive: true
      - dir: /usr/sbin/
        recursive: true
  selector:
    matchLabels:
      security: generic
  severity: 1
  tags:
    - PCI_DSS
    - MITRE
    - MITRE_T1553_Subvert_Trust_Controls

After recreating the pods, all bpf-containerd pods log this:

Detected a Security Policy (added/myapplication/generic-maint-tools-access)

However, a few hours later, I get these logs:

{
...
  "NamespaceName": "anothernamespace",
  "Owner":
    { "Ref": "StatefulSet", "Name": "somests", "Namespace": "anothernamespace" },
  "PodName": "somepod-2",
  "Labels": "...",
  "ContainerID": "...",
  "ContainerName": "somecontainername",
  "ContainerImage": "...",
  "HostPPID": ...,
  "HostPID": ...,
  "PPID": ...,
  "PID": ...,
  "UID": ...,
  "ParentProcessName": "...",
  "ProcessName": "...",
  "PolicyName": "generic-maint-tools-access",
  "Severity": "1",
  "Tags": "PCI_DSS,MITRE,MITRE_T1553_Subvert_Trust_Controls",
  "ATags": ["PCI_DSS", "MITRE", "MITRE_T1553_Subvert_Trust_Controls"],
  "Message": "Restricted maintenance tool access attempt detected",
  "Type": "MatchedPolicy",
  "Source": "....",
  "Operation": "Process",
  "Resource": "...",
  "Data": "syscall=SYS_EXECVE",
  "Enforcer": "eBPF Monitor",
  "Action": "Audit",
  "Result": "Passed",
  "Cwd": "/",
  ...
}

I have no other policy named generic-maint-tools-access in any other namespace, nor a cluster policy with that name.

carlosrodfern commented 3 months ago

It looks like the bug is right here?

https://github.com/kubearmor/KubeArmor/blob/7e7b1c390316970b42ddfc13b725187ef0c4b9f1/KubeArmor/core/kubeUpdate.go#L1016

It adds the policy to the returned list when this part of the condition returns true: || matchClusterSecurityPolicyRule(policy)

...but that function doesn't check whether the passed policy is a cluster policy. When matchExpressions is empty, it ends up adding one namespace (whichever comes back first in the k8s client response and hasn't been added yet) to the NamespaceList of every existing policy (cluster or not), then returns true, and the policy is added to the GetSecurityPolicies(..) response. It appears that over time, as matchClusterSecurityPolicyRule(..) is called, each policy's NamespaceList keeps growing by one namespace at a time, which would explain why it takes a while for @gusfcarvalho to see this behavior.

Should matchClusterSecurityPolicyRule(..) then check the policy type, also receive the namespaceName to match against the cluster policy, and fix the updating of NamespaceList?

I may be missing something.
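
To make the suspected flow concrete, here is a minimal standalone Go sketch (not the actual kubeUpdate.go code; every name in it is a simplified stand-in): the cluster-rule check also runs for a namespaced policy, grows its NamespaceList, and returns true, so the policy is handed out for namespaces it was never created in.

package main

import "fmt"

// Policy is a stand-in for a KubeArmor security policy.
type Policy struct {
    Name          string
    Namespace     string   // namespace the KubeArmorPolicy lives in
    NamespaceList []string // only meaningful for cluster policies
    Expressions   []string // matchExpressions namespaces (empty for plain policies)
}

func contains(list []string, s string) bool {
    for _, v := range list {
        if v == s {
            return true
        }
    }
    return false
}

// matchClusterRule mimics the reported behaviour: with empty matchExpressions
// it grows the policy's NamespaceList by one namespace per call and reports a
// match, even though the policy passed in may be a plain namespaced policy.
func matchClusterRule(p *Policy, clusterNamespaces []string) bool {
    if len(p.Expressions) == 0 {
        for _, ns := range clusterNamespaces {
            if !contains(p.NamespaceList, ns) {
                p.NamespaceList = append(p.NamespaceList, ns) // list keeps growing over time
                break
            }
        }
        return true
    }
    return false
}

// getPoliciesFor mimics GetSecurityPolicies(..): the cluster-rule check runs
// for every policy, cluster-scoped or not.
func getPoliciesFor(ns string, all []*Policy, clusterNamespaces []string) []*Policy {
    var out []*Policy
    for _, p := range all {
        if contains(p.NamespaceList, ns) || matchClusterRule(p, clusterNamespaces) {
            out = append(out, p)
        }
    }
    return out
}

func main() {
    deny := &Policy{Name: "default-deny", Namespace: "protected-namespace"}
    namespaces := []string{"unprotected-namespace", "protected-namespace"}

    // The namespaced default-deny policy ends up applied to a namespace it was
    // never created in.
    for _, p := range getPoliciesFor("unprotected-namespace", []*Policy{deny}, namespaces) {
        fmt.Println("applied:", p.Name, "NamespaceList:", p.NamespaceList)
    }
}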

carlosrodfern commented 3 months ago

Actually, from what I can understand of CreateSecurityPolicy(..) and how it is used, the cluster policies are already built with the NamespaceList properly initialized, so the function matchClusterSecurityPolicyRule(policy) could simply be removed, letting the || kl.ContainsElement(policy.Spec.Selector.NamespaceList, namespaceName) already present do the matching.

I'm trying to test that change, but I'm having issues building with the Dockerfile: bpf.c:28:10: fatal error: asm/unistd.h: No such file or directory... Hoping the maintainers can chime in soon.

carlosrodfern commented 3 months ago

I got past the build issues, created an image with this proposed fix, and deployed it internally. I'll report back on whether it solves my scenario.

Prateeknandle commented 3 months ago

Hey @carlosrodfern, thanks for the detailed explanation here. You are correct that there is a bug in the matchClusterSecurityPolicyRule check, because it also gets executed for ksp (KubeArmorSecurityPolicy) and will indeed return true in the case you mentioned. But we cannot remove this check, because no namespace watcher exists that would update the NamespaceList when a new namespace is created after the cluster policy is applied. In the case of the NotIn operator, it is important to update the NamespaceList when a new namespace is added/created.

Thus we would not recommend removing the check; rather, add a condition in matchClusterSecurityPolicyRule to check whether the policy is of ksp type and, if it is, return false early, roughly as sketched below.
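
A rough sketch of that guard (illustrative only; the real KubeArmor types, and the way the policy kind is tracked, differ):

package main

import "fmt"

// SecurityPolicy is a stand-in; the Kind field used here to distinguish a
// namespaced ksp from a cluster policy is hypothetical.
type SecurityPolicy struct {
    Name string
    Kind string // e.g. "KubeArmorPolicy" (ksp) or "KubeArmorClusterPolicy"
}

// matchClusterSecurityPolicyRule returns false early for a namespaced ksp, so
// the cluster-policy namespace matching (and NamespaceList updating) only ever
// runs for actual cluster policies.
func matchClusterSecurityPolicyRule(p SecurityPolicy) bool {
    if p.Kind != "KubeArmorClusterPolicy" {
        return false // ksp: never treated as a cluster policy
    }
    // ... existing cluster-policy namespace matching / NamespaceList update ...
    return false
}

func main() {
    ksp := SecurityPolicy{Name: "default-deny", Kind: "KubeArmorPolicy"}
    fmt.Println(matchClusterSecurityPolicyRule(ksp)) // false, so no leak into other namespaces
}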

BTW, on a side note, I don't think this was the bug causing the problem for @gusfcarvalho, because he is using v1.3.4, which does not have these changes; they were added later and are available in v1.4.0.

carlosrodfern commented 3 months ago

Thank you @Prateeknandle for looking into this. I'll be creating a separate issue and correcting the PR.