kubearmor / KubeArmor

Runtime Security Enforcement System. Workload hardening/sandboxing and implementing least-permissive policies made easy leveraging LSMs (BPF-LSM, AppArmor).
Apache License 2.0

Talos linux support #1540

Open nyrahul opened 10 months ago

nyrahul commented 10 months ago

Feature Request

Short Description

Support for Talos Linux needs to be validated. Validation has to be done for:

As an FYI, the tasks involved would be:

  1. Set up a Talos k8s cluster
  2. Install KubeArmor
  3. Deploy sample workloads and verify that the following are working:
    • Policy Enforcement
    • Alerts/Telemetry
  4. Get karmor probe output for reference and attach it to this issue
  5. Update the KubeArmor support matrix

nyrahul commented 10 months ago

Based on a test cluster setup, here are the observations:

❯ kubectl get nodes -o wide
NAME                           STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION                       CONTAINER-RUNTIME
talos-default-controlplane-1   Ready    control-plane   16h   v1.28.3      <none>        Talos (v1.5.5)   containerd://1.6.23
talos-default-worker-1         Ready    <none>          16h   v1.28.3      <none>        Talos (v1.5.5)   containerd://1.6.23

No BPF-LSM support in the default config! See below:

❯ talosctl --nodes copy /proc/config.gz .
❯ zcat config.gz | grep -i bpf
# BPF subsystem
# CONFIG_BPF_LSM is not set
# end of BPF subsystem
# CONFIG_BPFILTER is not set


  1. BPF-LSM, AppArmor, SELinux: none of the LSMs seem to be enabled by default.
  2. eBPF is enabled, which means audit/network-segmentation will work.
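
A minimal sketch of the check above, assuming a kernel config in the usual `CONFIG_X=y` / `# CONFIG_X is not set` format (for example a copy of /proc/config.gz). The function names and the set of options inspected are illustrative choices, not part of KubeArmor or talosctl.

```python
import gzip
import re

# Options for the LSMs named in this thread; the exact set to check is an
# assumption made for this sketch.
LSM_OPTIONS = (
    "CONFIG_BPF_LSM",
    "CONFIG_SECURITY_APPARMOR",
    "CONFIG_SECURITY_SELINUX",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map each option to its value; options marked 'is not set' map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2).strip('"')
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def lsm_report(text):
    """Return {option: bool} telling whether each LSM option is built in or a module."""
    options = parse_kconfig(text)
    return {name: options.get(name) in ("y", "m") for name in LSM_OPTIONS}


def read_config_gz(path):
    """Read a gzip-compressed kernel config such as a copy of /proc/config.gz."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


# The grep output quoted in this comment, used as sample input.
sample = """\
# BPF subsystem
# CONFIG_BPF_LSM is not set
# end of BPF subsystem
# CONFIG_BPFILTER is not set
"""
print(lsm_report(sample))
```

On a real node, `lsm_report(read_config_gz("config.gz"))` would run the same check against the copied file. An option that is absent from the config is reported the same way as one that is explicitly not set.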

Next steps?

Check with the Talos team about enabling BPF-LSM in the kernel config, then use the updated image. BPF-LSM is enabled on most other hardened distributions such as EKS-Bottlerocket, GKE-COS, Oracle-UEK, EKS-AmazonLinux, Azure-Mariner, etc.

professorabhay commented 10 months ago

Hey, @nyrahul! I want to work on this. Could you tell me anything else that will help me to proceed?

mattiashem commented 10 months ago


Maybe you need to build your own kernel ...

SankalpDoC commented 9 months ago

Seems doable to me... Please assign this to me @nyrahul

harisudarsan1 commented 9 months ago

@nyrahul I have asked the community for BPF-LSM support. I'm interested in this task

nyrahul commented 9 months ago

Thank you very much for the interest.

As mentioned by @mattiashem, the task would involve:

  1. Creating a new kernel image for Talos with BPF-LSM enabled.
  2. Using that image to set up a Talos cluster and installing KubeArmor on it.
  3. Executing the KubeArmor testsuite on the target cluster.

I can add multiple assignees since several people have expressed interest (thank you!). Please continue the discussion on this thread if anything is not clear.

SankalpDoC commented 9 months ago

@nyrahul I'm fairly new to the project... And here's what I've found so far.

My Cluster Info

$ sudo kubectl get nodes -o wide
NAME                           STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                     KERNEL-VERSION          CONTAINER-RUNTIME
talos-default-controlplane-1   Ready    control-plane   2m35s   v1.28.3      <none>        Talos (v1.6.0-alpha.2-42-g59b62398f-dirty)   6.6.6-200.fc39.x86_64   containerd://1.7.11
talos-default-worker-1         Ready    <none>          2m35s   v1.28.3      <none>        Talos (v1.6.0-alpha.2-42-g59b62398f-dirty)   6.6.6-200.fc39.x86_64   containerd://1.7.11

Used the wordpress example as the sample workload:

[ryu@sdoc examples]$ cd wordpress-mysql/
[ryu@sdoc wordpress-mysql]$ ls
original  security-policies  wordpress-mysql-deployment.yaml
[ryu@sdoc wordpress-mysql]$ sudo kubectl apply -f .
[sudo] password for ryu: 
namespace/wordpress-mysql created
service/wordpress created
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "wordpress" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "wordpress" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "wordpress" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "wordpress" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/wordpress created
service/mysql created
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "mysql" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "mysql" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "mysql" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "mysql" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/mysql created
[ryu@sdoc wordpress-mysql]$ sudo kubectl get pod,svc -n wordpress-mysql
NAME                             READY   STATUS    RESTARTS   AGE
pod/mysql-64d8fbdf68-sqnnw       1/1     Running   0          111s
pod/wordpress-78bc585459-jc7p6   1/1     Running   0          111s

NAME                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/mysql       ClusterIP    <none>        3306/TCP       111s
service/wordpress   NodePort   <none>        80:30080/TCP   111s

Here's the output of karmor probe:

[ryu@sdoc KubeArmor]$ sudo karmor probe --full

Didnot find KubeArmor in systemd or Kubernetes, probing for support for KubeArmor

     Observability/Audit: Supported (Kernel Version 6.6.6)
     Enforcement: Full (Supported LSMs: lockdown,capability,yama,selinux,bpf,landlock)
W1217 23:27:19.699109   64607 warnings.go:70] would violate PodSecurity "restricted:latest": privileged (container "karmor-probe" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "karmor-probe" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "karmor-probe" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volumes "lsm-path", "lib-modules", "kernel-header" use restricted volume type "hostPath"), runAsNonRoot != true (pod or container "karmor-probe" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "karmor-probe" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Creating probe daemonset ...
Node 1 : 
     Observability/Audit:panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/kubearmor/kubearmor-client/probe.checkNodeKernelHeaderPresent(0xc000948820, {{0x28f1e8a, 0x9}, 0x1, {0x28e30c5, 0x4}, {0x0, 0x0}}, {0xc000a080a0, 0x1c})
    /home/runner/work/kubearmor-client/kubearmor-client/probe/probe.go:338 +0x2a6
github.com/kubearmor/kubearmor-client/probe.probeNode(0xc000948820, {{0x28f1e8a, 0x9}, 0x1, {0x28e30c5, 0x4}, {0x0, 0x0}})
    /home/runner/work/kubearmor-client/kubearmor-client/probe/probe.go:393 +0x297
github.com/kubearmor/kubearmor-client/probe.PrintProbeResult(0xc000948820, {{0x28f1e8a, 0x9}, 0x1, {0x28e30c5, 0x4}, {0x0, 0x0}})
    /home/runner/work/kubearmor-client/kubearmor-client/probe/probe.go:203 +0xf46
github.com/kubearmor/kubearmor-client/cmd.glob..func7(0xc0001d1a00?, {0x28e310d?, 0x4?, 0x28e3111?})
    /home/runner/work/kubearmor-client/kubearmor-client/cmd/probe.go:26 +0x4b
github.com/spf13/cobra.(*Command).execute(0x44982a0, {0xc0005a16b0, 0x1, 0x1})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x87c
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5
    /home/runner/work/kubearmor-client/kubearmor-client/cmd/root.go:49 +0x1a
    /home/runner/work/kubearmor-client/kubearmor-client/main.go:10 +0xf
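
The `Supported LSMs` list in the probe output above has the same comma-separated form that the Linux kernel exposes at /sys/kernel/security/lsm. A rough sketch of turning that list into an enforcement decision follows. The helper names and the preference order (BPF-LSM, then AppArmor, then SELinux) are assumptions for illustration, not KubeArmor's actual selection logic.

```python
# LSM names as they appear in the kernel's active-LSM list. The preference
# order is an assumption of this sketch, not KubeArmor's documented behaviour.
ENFORCER_PRIORITY = ("bpf", "apparmor", "selinux")


def parse_lsm_list(raw):
    """Split a comma-separated LSM list into clean names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def pick_enforcer(raw):
    """Return the first preferred LSM found in the list, or None if there is none."""
    active = parse_lsm_list(raw)
    for candidate in ENFORCER_PRIORITY:
        if candidate in active:
            return candidate
    return None


def read_active_lsms(path="/sys/kernel/security/lsm"):
    """Read the active LSM list from securityfs (requires a Linux host)."""
    with open(path) as handle:
        return handle.read()


# The list reported by karmor probe in this comment.
print(pick_enforcer("lockdown,capability,yama,selinux,bpf,landlock"))
# A list with no supported enforcer.
print(pick_enforcer("capability,yama"))
```

With no candidate present, as in the second call, only observability/audit would be expected to work, which matches the earlier observation about the default Talos kernel.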

Here's the output for the testsuite:

[ryu@sdoc k8s_env]$ sudo make
[sudo] password for ryu: 
# run in two steps as syscall suite fails if run at the very end
# see - https://github.com/kubearmor/KubeArmor/issues/1269
Running Suite: Syscalls Suite - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls
Random Seed: 1702836956

Will run 19 of 19 specs
  > Enter [BeforeSuite] TOP-LEVEL - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:16 @ 12/17/23 23:46:01.995
  < Exit [BeforeSuite] TOP-LEVEL - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:16 @ 12/17/23 23:46:03.202 (1.207s)
[BeforeSuite] PASSED [1.207 seconds]
  Match syscalls
    can detect unlink syscall
  > Enter [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:46:03.202
time="2023-12-17T23:46:03+05:30" level=info msg="K8sGetPods pod=ubuntu-1-deployment- ns=syscalls ants=[kubearmor-policy: enabled] timeout=60"
  [FAILED] Expected
      <*errors.errorString | 0xc0002d3a80>: 
      pod not found
      {s: "pod not found"}
  to be nil
  In [BeforeEach] at: /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:33 @ 12/17/23 23:47:04.589
  < Exit [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:47:04.589 (1m1.387s)
  > Enter [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:47:04.589
  < Exit [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:47:04.678 (89ms)

  Attempt #1 Failed.  Retrying ↺ @ 12/17/23 23:47:04.678

  > Enter [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:47:04.678
time="2023-12-17T23:47:04+05:30" level=info msg="K8sGetPods pod=ubuntu-1-deployment- ns=syscalls ants=[kubearmor-policy: enabled] timeout=60"
  [FAILED] Expected
      <*errors.errorString | 0xc0003ea000>: 
      pod not found
      {s: "pod not found"}
  to be nil
  In [BeforeEach] at: /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:33 @ 12/17/23 23:48:06.023
  < Exit [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:48:06.024 (1m1.345s)
  > Enter [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:48:06.024
  < Exit [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:48:06.101 (78ms)

  Attempt #2 Failed.  Retrying ↺ @ 12/17/23 23:48:06.101

  > Enter [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:48:06.101
time="2023-12-17T23:48:06+05:30" level=info msg="K8sGetPods pod=ubuntu-1-deployment- ns=syscalls ants=[kubearmor-policy: enabled] timeout=60"
  [TIMEDOUT] A suite timeout occurred
  In [BeforeEach] at: /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:48:56.251

  This is the Progress Report generated when the suite timeout occurred:
    Syscalls Match syscalls can detect unlink syscall (Spec Runtime: 2m53.05s)
      In [BeforeEach] (Node Runtime: 50.15s)

      Spec Goroutine
      goroutine 212 [sleep]
        github.com/kubearmor/KubeArmor/tests/util.K8sGetPods({0x2f9a116, 0x14}, {0x2f83eb8, 0x8}, {0xc0003ea590, 0x1, 0x1}, 0x3c)
      > github.com/kubearmor/KubeArmor/tests/k8s_env/syscalls.getUbuntuPod({0x2f9a116, 0x14}, {0x2fa437a, 0x19})
            | func getUbuntuPod(name string, ant string) string {
            >   pods, err := K8sGetPods(name, "syscalls", []string{ant}, 60)
            |   Expect(err).To(BeNil())
            |   Expect(len(pods)).To(Equal(1))
      > github.com/kubearmor/KubeArmor/tests/k8s_env/syscalls.glob..func3.1()
            | BeforeEach(func() {
            >   ubuntu = getUbuntuPod("ubuntu-1-deployment-", "kubearmor-policy: enabled")
            | })
        github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func3({0x1, 0x0})
        github.com/onsi/ginkgo/v2/internal.(*Suite).runNode in goroutine 9
  < Exit [BeforeEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:42 @ 12/17/23 23:48:56.253 (50.152s)
  > Enter [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:48:56.253
  < Exit [AfterEach] Syscalls - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:46 @ 12/17/23 23:48:56.39 (136ms)
• [TIMEDOUT] [173.188 seconds]
Syscalls [BeforeEach]
  Match syscalls
    can detect unlink syscall
  Match syscalls
    can detect unlink syscall from dir source
  Match syscalls
    can detect unlink syscall from recursive dir source
  Match syscalls
    can detect unlink syscall from path source
  Match paths
    can detect unlink syscall recursive target
  Match paths
    can detect unlink syscall targets absolute file path
  Match paths
    can detect unlink syscall recursive target from absolute path
  Match paths
    can detect unlink syscall recursive target from recursive dir
  Match paths
    can detect unlink syscall recursive target from dir
  Policy informations for matchsyscalls
    can detect unlink syscall recursive target with global informations
  Policy informations for matchsyscalls
    can detect unlink syscall recursive target with local informations
  Policy informations for matchsyscalls
    can detect unlink syscall recursive target with local informations when global is set
  Policy informations for matchsyscalls
    can detect unlink syscall recursive target with missing local informations when global is set
  Policy informations for matchpaths
    can detect unlink syscall recursive target with global informations
  Policy informations for matchpaths
    can detect unlink syscall recursive target with local informations
  Policy informations for matchpaths
    can detect unlink syscall recursive target with local informations when global is set
  Policy informations for matchpaths
    can detect unlink syscall recursive target with missing local informations when global is set
  Policy informations for matchpaths
    mount will be blocked by default for a pod
  Policy informations for matchpaths
    umount will be blocked by default for a pod as the capability not added
  > Enter [AfterSuite] TOP-LEVEL - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:25 @ 12/17/23 23:48:56.39
  [TIMEDOUT] A grace period timeout occurred
  In [AfterSuite] at: /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:25 @ 12/17/23 23:49:26.391

  This is the Progress Report generated when the grace period timeout occurred:
    In [AfterSuite] (Node Runtime: 30s)

    Spec Goroutine
    goroutine 212 [chan send]
      panic({0x2def640?, 0xc000594d20?})
      github.com/onsi/ginkgo/v2.Fail({0xc000486310, 0x68}, {0xc000753718?, 0xc000486310?, 0x0?})
      github.com/onsi/gomega/internal.(*Assertion).match(0xc000492ec0, {0x32e45a8, 0x464bb60}, 0x1, {0x0, 0x0, 0x0})
      github.com/onsi/gomega/internal.(*Assertion).To(0xc000492ec0, {0x32e45a8, 0x464bb60}, {0x0, 0x0, 0x0})
    > github.com/kubearmor/KubeArmor/tests/k8s_env/syscalls.getUbuntuPod({0x2f9a116, 0x14}, {0x2fa437a, 0x19})
          | func getUbuntuPod(name string, ant string) string {
          |     pods, err := K8sGetPods(name, "syscalls", []string{ant}, 60)
          >     Expect(err).To(BeNil())
          |     Expect(len(pods)).To(Equal(1))
          |     return pods[0]
    > github.com/kubearmor/KubeArmor/tests/k8s_env/syscalls.glob..func3.1()
          | BeforeEach(func() {
          >     ubuntu = getUbuntuPod("ubuntu-1-deployment-", "kubearmor-policy: enabled")
          | })
      github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func3({0x1, 0x0})
      github.com/onsi/ginkgo/v2/internal.(*Suite).runNode in goroutine 9

    Goroutines of Interest
    goroutine 260 [syscall]
      syscall.Syscall6(0xc000343a20?, 0xc0005aa5a0?, 0xc000617ce6?, 0xc000617e08?, 0x1191304?, 0xc000529518?, 0x1?)
      github.com/kubearmor/KubeArmor/tests/util.Kubectl({0xc00078bad0?, 0xc?})
      github.com/kubearmor/KubeArmor/tests/util.K8sDelete({0xc000617f58?, 0x1, 0xc000617f01?})
    > github.com/kubearmor/KubeArmor/tests/k8s_env/syscalls.glob..func2()
          | var _ = AfterSuite(func() {
          |     // delete wordpress-mysql app from syscalls ns
          >     err := K8sDelete([]string{"manifests/ubuntu-deployment.yaml"})
          |     Expect(err).To(BeNil())
          | })
      github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func3({0x1, 0x0})
      github.com/onsi/ginkgo/v2/internal.(*Suite).runNode in goroutine 9
  < Exit [AfterSuite] TOP-LEVEL - /home/ryu/Desktop/Hobbies/KubeArmor/KubeArmor/tests/k8s_env/syscalls/syscalls_test.go:25 @ 12/17/23 23:49:26.393 (30.002s)
[AfterSuite] [TIMEDOUT] [30.002 seconds]

Summarizing 2 Failures:
  [TIMEDOUT] Syscalls [BeforeEach] Match syscalls can detect unlink syscall
  [TIMEDOUT] [AfterSuite] 

Ran 1 of 19 Specs in 204.398 seconds
FAIL! - Suite Timeout Elapsed -- 0 Passed | 1 Failed | 0 Pending | 18 Skipped
--- FAIL: TestSyscalls (204.40s)

Ginkgo ran 1 suite in 3m30.232402109s

Test Suite Failed
make: *** [Makefile:9: build] Error 1

I don't completely understand everything I've done here, so any feedback or suggestions would be highly appreciated!

ChucklesDroid commented 8 months ago

Hi, I'll be taking this issue up.

ChucklesDroid commented 8 months ago

Can you also assign this issue to me, @nyrahul? Thanks!

SD-13 commented 8 months ago

I would like to work on this with @SankalpDoC

SD-13 commented 8 months ago

@nyrahul I can start with building the image with BPF-LSM enabled.

SankalpDoC commented 7 months ago

Can anyone help me with this? Is it a version-specific issue, or something familiar?

$ sudo karmor install
😄      Auto Detected Environment : generic                                               
🔥      CRD kubearmorpolicies.security.kubearmor.com                                      
ℹ️       CRD kubearmorpolicies.security.kubearmor.com already exists                       
🔥      CRD kubearmorhostpolicies.security.kubearmor.com                                  
ℹ️       CRD kubearmorhostpolicies.security.kubearmor.com already exists                   
💫      Service Account                                                                   
ℹ️       Service Account already exists                                                    
⚙️       Cluster Role                                                                      
ℹ️       Cluster Role already exists                                                       
⚙️       Cluster Role Bindings                                                             
ℹ️       Cluster Role Bindings already exists                                              
🛡       KubeArmor Relay Service                                                           
ℹ️       KubeArmor Relay Service already exists                                            
🛰       KubeArmor Relay Deployment                                                        
W0220 13:59:17.894966   30438 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "kubearmor-relay-server" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "kubearmor-relay-server" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "kubearmor-relay-server" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "kubearmor-relay-server" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
ℹ️       KubeArmor Relay Deployment already exists                                         
🛡       KubeArmor DaemonSet - Init kubearmor/kubearmor-init:stable, Container kubearmor/kubearmor:stable  -gRPC=32767  
W0220 13:59:17.902360   30438 warnings.go:70] would violate PodSecurity "restricted:latest": forbidden AppArmor profile (container.apparmor.security.beta.kubernetes.io/kubearmor="unconfined"), host namespaces (hostNetwork=true, hostPID=true), allowPrivilegeEscalation != false (containers "init", "kubearmor" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "init", "kubearmor" must not include "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "IPC_LOCK", "MAC_ADMIN", "SETGID", "SETPCAP", "SETUID", "SYS_ADMIN", "SYS_PTRACE", "SYS_RESOURCE" in securityContext.capabilities.add), restricted volume types (volumes "lib-modules-path", "sys-fs-bpf-path", "sys-kernel-security-path", "sys-kernel-debug-path", "os-release-path", "usr-src-path", "etc-apparmor-d-path", "containerd-sock-path" use restricted volume type "hostPath"), runAsNonRoot != true (pod or containers "init", "kubearmor" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "init", "kubearmor" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
ℹ️       KubeArmor DaemonSet already exists                                                
🛡       KubeArmor Controller TLS certificates                                             
ℹ️       KubeArmor Controller TLS certificates already exists                              
💫      KubeArmor Controller Service Account                                              
ℹ️       KubeArmor Controller Service Account already exists                               
⚙️       KubeArmor Controller Roles                                                        
🚀      KubeArmor Controller Deployment                                                   
W0220 13:59:20.508966   30438 warnings.go:70] would violate PodSecurity "restricted:latest": forbidden AppArmor profiles (container.apparmor.security.beta.kubernetes.io/kube-rbac-proxy="unconfined", container.apparmor.security.beta.kubernetes.io/manager="unconfined"), allowPrivilegeEscalation != false (container "kube-rbac-proxy" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "kube-rbac-proxy", "manager" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "sys-path" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or containers "kube-rbac-proxy", "manager" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "kube-rbac-proxy", "manager" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
ℹ️       KubeArmor Controller Deployment already exists                                    
🚀      KubeArmor Controller Metrics Service                                              
ℹ️       KubeArmor Controller Metrics Service already exists                               
🚀      KubeArmor Controller Webhook Service                                              
ℹ️       KubeArmor Controller Webhook Service already exists                               
🤩      KubeArmor Controller Mutation Admission Registration                              
ℹ️       KubeArmor Controller Mutation Admission Registration already exists               
🚀      KubeArmor ConfigMap Creation                                                      
🥳      Done Installing KubeArmor                                                         
ℹ️       KubeArmor ConfigMap already exists                                                
🥳      Done Installing KubeArmor                                                         
😋   Checking if KubeArmor pods are running...
🥳   Done Checking , ALL Services are running!             
⌚️  Execution Time : 205.328256ms 

🔧   Verifying KubeArmor functionality (this may take upto a minute)...panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x17e7d03]

goroutine 1 [running]:
github.com/kubearmor/kubearmor-client/probe.readDataFromKubeArmor(0xc0004aaa00, {{0x28f3eb1, 0x9}, _, {_, _}, {_, _}}, {0xc00090b160, 0x1c})
    /home/runner/work/kubearmor-client/kubearmor-client/probe/probe.go:520 +0x8c3
github.com/kubearmor/kubearmor-client/probe.ProbeRunningKubeArmorNodes(0xc0004aaa00, {{0x28f3eb1, 0x9}, 0x0, {0x0, 0x0}, {0x0, 0x0}})
    /home/runner/work/kubearmor-client/kubearmor-client/probe/probe.go:502 +0x315
github.com/kubearmor/kubearmor-client/install.checkPods(0xc0004aaa00, {{0x28f3eb1, 0x9}, {0x292a80b, 0x1f}, {0x291cce8, 0x1a}, {0x293ccaa, 0x25}, {0x2943124, ...}, ...})
    /home/runner/work/kubearmor-client/kubearmor-client/install/install.go:162 +0x505
github.com/kubearmor/kubearmor-client/install.K8sInstaller(0xc0004aaa00, {{0x28f3eb1, 0x9}, {0x292a80b, 0x1f}, {0x291cce8, 0x1a}, {0x293ccaa, 0x25}, {0x2943124, ...}, ...})
    /home/runner/work/kubearmor-client/kubearmor-client/install/install.go:634 +0x6825
github.com/kubearmor/kubearmor-client/cmd.glob..func1(0xc0001d1200?, {0x28e512d?, 0x4?, 0x28e5131?})
    /home/runner/work/kubearmor-client/kubearmor-client/cmd/install.go:24 +0xf8
github.com/spf13/cobra.(*Command).execute(0x449ad20, {0x4501d80, 0x0, 0x0})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x87c
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5
    /home/runner/work/kubearmor-client/kubearmor-client/cmd/root.go:49 +0x1a
    /home/runner/work/kubearmor-client/kubearmor-client/main.go:10 +0xf

Though karmor probe is working fine:

$ sudo karmor probe

Found KubeArmor running in Systemd mode 

Host : 
    OS Image:                   Ubuntu 22.04.4 LTS                  
    Kernel Version:             6.5.0-18-generic                    
    Kubelet Version:                                                
    Container Runtime:                                              
    Active LSM:                 BPFLSM                              
    Host Security:              true                                
    Container Security:         true                                
    Container Default Posture:  audit(File)                         audit(Capabilities) audit(Network)  
    Host Default Posture:       audit(File)                         audit(Capabilities) audit(Network)  
    Host Visibility:            process,file,network,capabilities   
Armored Up Containers : 
|        CONTAINER NAME        | POLICY |
| kind-worker2                 |        |
| kind-control-plane           |        |
| kind-worker                  |        |
| talos-default-worker-1       |        |
| talos-default-controlplane-1 |        |

But still karmor says kubearmor's not running...

$ sudo karmor version
karmor version 1.1.0 linux/amd64 BuildDate=2024-02-07T08:37:18Z
current version is the latest
kubearmor not running

ChucklesDroid commented 7 months ago

To summarize, this is a two-fold issue: 1) getting a Talos OS kernel with an LSM enabled, and 2) installing KubeArmor on it.

For 1, we need to enable the flags related to BPF-LSM and AppArmor or SELinux (so two separate images will be required). Edit the following file to enable the required flags: https://github.com/siderolabs/pkgs/blob/main/kernel/build/config-amd64#L5221

NOTE: only build the kernel image for your system's architecture; otherwise, unless your system is powerful enough, the build is likely to crash. (The test machine used for this build was a 12th-gen i7, and even that wasn't powerful enough.)
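
The config edit described above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the sample input is made up, and only `CONFIG_BPF_LSM` and the `CONFIG_LSM` ordering string are touched. Kconfig dependencies are not resolved here, so the kernel build's own config tooling should still validate the result.

```python
import re


def set_kconfig_option(text, name, value):
    """Set NAME=value, replacing an assignment or a '# NAME is not set' marker.

    If the option does not appear at all, the assignment is appended.
    """
    new_line = f"{name}={value}"
    pattern = re.compile(
        rf"^(?:# {re.escape(name)} is not set|{re.escape(name)}=.*)$",
        re.MULTILINE,
    )
    if pattern.search(text):
        # A function replacement avoids backslash interpretation in the value.
        return pattern.sub(lambda _match: new_line, text, count=1)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + new_line + "\n"


def add_to_lsm_list(text, lsm="bpf"):
    """Append `lsm` to the CONFIG_LSM ordering string if it is not listed yet."""
    match = re.search(r'^CONFIG_LSM="([^"]*)"$', text, re.MULTILINE)
    current = [item for item in match.group(1).split(",") if item] if match else []
    if lsm not in current:
        current.append(lsm)
    return set_kconfig_option(text, "CONFIG_LSM", '"' + ",".join(current) + '"')


# Hypothetical excerpt of a kernel config, for illustration only.
before = '# CONFIG_BPF_LSM is not set\nCONFIG_LSM="lockdown,yama"\n'
after = add_to_lsm_list(set_kconfig_option(before, "CONFIG_BPF_LSM", "y"))
print(after)
```

The second helper is there because building an LSM in is not enough on its own: it also has to be named in `CONFIG_LSM` (or in the `lsm=` boot parameter) to be active at boot.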

Next, to run the cluster with the updated kernel, you need to use VirtualBox (no other option is supported for this).

Steps for creating the Talos cluster (a 1-node cluster is described here, but ideally a 3-node HA cluster should be created):

a) Generating config:
  1. talosctl gen config talos-vbox https://:6443 -o vbox-configs/

Kubernetes recommends tainting controlplane nodes so that they do not run workloads (this is important for upgrading the kernel): https://www.talos.dev/v1.6/talos-guides/howto/workers-on-controlplane/

b) Creating master nodes:
  1. talosctl apply-config -e -n -f vbox-configs/init.yaml --insecure

c) Creating worker nodes:
  1. talosctl apply-config -e -n -f vbox-configs/join.yaml --insecure

d) Registering master node:
  1. export TALOSCONFIG="vbox-configs/talosconfig"
  2. talosctl config endpoints
  3. talosctl config nodes
  4. talosctl dmesg [OPTIONAL] [PRINT MESSAGE]

e) Setting up k8s:
  1. talosctl kubeconfig . -f
  2. export KUBECONFIG=kubeconfig
  3. kubectl get nodes [OPTIONAL] [PRINT MESSAGE]

Upgrading the Talos OS kernel (when using an HA cluster, remove the preserve flag):

talosctl upgrade --nodes --image --preserve=true

For help, you can reach out on the siderolabs support channel.

For 2, installing KubeArmor on Talos: @rootxrishabh got it to work on his end and can provide further details.

rootxrishabh commented 7 months ago

@ChucklesDroid thank you for the insights. Installing KubeArmor goes something like this (I already mentioned this earlier, but let's preserve it here):

Docker is the default provisioner used by Talos. When running on Docker, Talos spins up containers that use the host kernel, so on any host kernel with BPF-LSM or AppArmor enabled, KubeArmor should support Talos by default. However, we hit some errors installing KubeArmor the first time.

It turned out that Talos ships with a pre-configured Pod Security Admission controller, which is not configured by default on a regular k8s cluster. The admission controller applies a restricted access level across all namespaces except kube-system, which prevented the KubeArmor pods from spinning up.

As the admission controller is namespaced, the solution was to reconfigure it through the Talos machine configuration so that the kubearmor namespace is exempted and can operate at a privileged level.

Fix: the Pod Security Admission controller runs on all Talos provisioners. The current fix was applied manually; we have to integrate with the Talos API to auto-configure the machine config whenever the KubeArmor installation detects a Talos Linux environment.
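
The namespace exemption described above can be sketched as a small transformation on a parsed machine config. The field layout (`cluster.apiServer.admissionControl` with a `PodSecurity` entry carrying `exemptions.namespaces`) is assumed from the Talos machine configuration format and should be checked against the Talos docs for the version in use. `exempt_namespace` is a hypothetical helper and the sample input is made up.

```python
import copy
import json

# Assumed identifiers for the Pod Security Admission plugin configuration.
PSA_API_VERSION = "pod-security.admission.config.k8s.io/v1alpha1"
PSA_KIND = "PodSecurityConfiguration"


def exempt_namespace(machine_config, namespace):
    """Return a copy of the config with `namespace` added to the PodSecurity exemptions."""
    config = copy.deepcopy(machine_config)
    plugins = (
        config.setdefault("cluster", {})
        .setdefault("apiServer", {})
        .setdefault("admissionControl", [])
    )
    plugin = next((p for p in plugins if p.get("name") == "PodSecurity"), None)
    if plugin is None:
        # No PodSecurity entry yet: create a bare one that holds only exemptions.
        plugin = {
            "name": "PodSecurity",
            "configuration": {"apiVersion": PSA_API_VERSION, "kind": PSA_KIND},
        }
        plugins.append(plugin)
    exemptions = plugin.setdefault("configuration", {}).setdefault("exemptions", {})
    namespaces = exemptions.setdefault("namespaces", [])
    if namespace not in namespaces:
        namespaces.append(namespace)
    return config


# Made-up sample input in which only kube-system is exempt.
sample = {
    "cluster": {
        "apiServer": {
            "admissionControl": [
                {
                    "name": "PodSecurity",
                    "configuration": {
                        "apiVersion": PSA_API_VERSION,
                        "kind": PSA_KIND,
                        "exemptions": {"namespaces": ["kube-system"]},
                    },
                }
            ]
        }
    }
}
patched = exempt_namespace(sample, "kubearmor")
print(json.dumps(patched, indent=2))
```

The function returns a modified copy and is idempotent, which would matter if an installer ran it on every install against a cluster that may already be patched. How the resulting config is pushed back to the nodes is left out, since that depends on the Talos API integration mentioned above.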