jodevsa / wireguard-operator

Painless deployment of wireguard on kubernetes
MIT License
658 stars 41 forks

Does not work with baseline pod security standard #170

Open uhthomas opened 6 months ago

uhthomas commented 6 months ago

Describe the bug

❯ k describe rs
Events:
  Type     Reason        Age                 From                   Message
  ----     ------        ----                ----                   -------
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-vxz94" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-xz8fh" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-85956" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-bh8p7" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-ln28h" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-wjsrs" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-psmgq" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  109s                replicaset-controller  Error creating: pods "media-dep-878876c8d-ctlb4" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  108s                replicaset-controller  Error creating: pods "media-dep-878876c8d-qwstr" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
  Warning  FailedCreate  27s (x6 over 107s)  replicaset-controller  (combined from similar events): Error creating: pods "media-dep-878876c8d-fvh5h" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)

To Reproduce

Run a Kubernetes cluster that enforces the baseline pod security standard (e.g. Talos).

https://kubernetes.io/docs/concepts/security/pod-security-admission/

Expected behavior

Optionally use the userspace wireguard implementation.

Screenshots

N/A

Additional context

uhthomas commented 6 months ago

Maybe the operator could remove the privileged security context if the user space implementation is being used?

matrix-root commented 5 months ago

Did you have any success running it on top of Talos?

I've added the pod-security.kubernetes.io/enforce: privileged label to the namespace. Do you think that's safe and sufficient?

uhthomas commented 5 months ago

Did you have any success running it on top of Talos?

I've added the pod-security.kubernetes.io/enforce: privileged label to the namespace. Do you think that's safe and sufficient?

I use Talos, and it works but it does need that label. A lot of projects need it unfortunately.

Twi commented 5 months ago

I ended up using this magic incantation to fix wireguard on Talos:

apiVersion: v1
kind: Namespace
metadata:
  name: wireguard
  labels:
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: latest

uhthomas commented 5 months ago

@Twi The only label which should be necessary is pod-security.kubernetes.io/enforce: privileged. The logs may complain without some of those other labels, but it will work.

jodevsa commented 5 months ago

Can this change be added to the project? I've never used Talos, so I cannot test it :(. I'd really appreciate it if you could add it!

jodevsa commented 5 months ago

Optionally use the userspace wireguard implementation.

I'm wondering if there is a way to detect that we are running on Talos and cannot run the kernel-mode wireguard?

uhthomas commented 5 months ago

It would be a nice feature to have, though it is important to note this is not specific to Talos but any Kubernetes cluster which enforces the baseline pod security standard. There is already some fallback mechanism in place when creating the tunnel itself, but I believe the operator will need to also make changes to the pods too.

matrix-root commented 5 months ago

If we don't succeed with the user space implementation, we can at least add a notice about PodSecurity to the README :)

Otherwise it could take others a while to discover the cause of the issue.

uhthomas commented 5 months ago

I wonder what the right way to do this is? I guess the first step is to add some configuration option to force user space (and remove NET_ADMIN from the security capabilities). A feature could then be built on top of that which automatically detects the current pod security standard? Not sure what the right default is. User space is likely to be less efficient, but more compatible.

jodevsa commented 4 months ago

I like the multiple phases approach ^^

So there is currently a parameter in the Wireguard resource called useWgUserspaceImplementation:

              useWgUserspaceImplementation:
                description: A boolean field that specifies whether to use the userspace

https://github.com/jodevsa/wireguard-operator/blob/main/config/crd/bases/vpn.wireguard-operator.io_wireguards.yaml#L72

This parameter gets populated in the agent, which is the bootstrapping software that actually runs wireguard. What is currently missing is that we need to stop populating the security capabilities if useWgUserspaceImplementation is true.

so around here: https://github.com/jodevsa/wireguard-operator/blob/main/pkg/controllers/wireguard_controller.go#L741

we need something like:

if !m.spec.useWgUserspaceImplementation {
    // inject the NET_ADMIN security capability
}
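To make that a bit more concrete, here is a minimal, self-contained sketch of the idea. The types below are simplified, hypothetical stand-ins for the real Kubernetes API types and for the operator's spec, and agentSecurityContext is a made-up helper name, not the function that exists in wireguard_controller.go today. It also assumes the userspace implementation can actually run without NET_ADMIN, which nobody in this thread has verified yet.

```go
package main

import "fmt"

// Capabilities and SecurityContext are simplified, hypothetical stand-ins for
// the corev1.Capabilities and corev1.SecurityContext types in k8s.io/api/core/v1.
type Capabilities struct {
	Add []string
}

type SecurityContext struct {
	Capabilities *Capabilities
}

// WireguardSpec mirrors only the one CRD field discussed in this thread.
type WireguardSpec struct {
	UseWgUserspaceImplementation bool
}

// agentSecurityContext (hypothetical helper) builds the security context for
// a container. NET_ADMIN is requested only for the kernel implementation; in
// userspace mode no extra capabilities are added.
func agentSecurityContext(spec WireguardSpec) *SecurityContext {
	if spec.UseWgUserspaceImplementation {
		return &SecurityContext{}
	}
	return &SecurityContext{
		Capabilities: &Capabilities{Add: []string{"NET_ADMIN"}},
	}
}

func main() {
	for _, userspace := range []bool{false, true} {
		sc := agentSecurityContext(WireguardSpec{UseWgUserspaceImplementation: userspace})
		fmt.Printf("userspace=%v addsCapabilities=%v\n", userspace, sc.Capabilities != nil)
	}
}
```

Since the error above names both the "metrics" and "agent" containers, the same check would presumably have to apply to each of them.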

jodevsa commented 4 months ago

which automatically detects the current pod security standard

Any ideas on how we can detect that? Is there a Kubernetes ConfigMap that can be read to know the allowed capabilities? I think that might be more straightforward than trying to run a pod with that capability and waiting to see if it fails.
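One possible starting point: as far as I know, Pod Security Admission is driven by namespace labels such as pod-security.kubernetes.io/enforce (the same label used earlier in this thread), plus a cluster-wide default in the API server's admission configuration, rather than a ConfigMap. The sketch below only inspects the namespace labels, passed in as a plain map. netAdminAllowed is a hypothetical helper, and it reports "unknown" when the label is missing, because the cluster-wide default is not visible from the namespace itself.

```go
package main

import "fmt"

// enforceLabel is the namespace label read by Pod Security Admission.
const enforceLabel = "pod-security.kubernetes.io/enforce"

// netAdminAllowed (hypothetical helper) reports whether the enforce level on
// a namespace permits adding NET_ADMIN. known is false when the label is
// absent or has an unrecognized value, in which case the cluster default
// applies and cannot be determined from the labels alone.
func netAdminAllowed(nsLabels map[string]string) (allowed, known bool) {
	switch nsLabels[enforceLabel] {
	case "privileged":
		return true, true
	case "baseline", "restricted":
		// Neither level lists NET_ADMIN among its allowed capabilities.
		return false, true
	default:
		return false, false
	}
}

func main() {
	for _, labels := range []map[string]string{
		{enforceLabel: "privileged"},
		{enforceLabel: "baseline"},
		{}, // no label: falls back to the cluster default, which we cannot see
	} {
		allowed, known := netAdminAllowed(labels)
		fmt.Printf("labels=%v allowed=%v known=%v\n", labels, allowed, known)
	}
}
```

In the operator the labels would come from the Namespace object of the Wireguard resource. On a cluster like Talos, where baseline is the default without any label, this check alone would return "unknown", so a dry-run pod creation might still be needed as a fallback.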

jodevsa commented 4 months ago

So, going back to what @uhthomas suggested, we have two phases to get this complete:

Phase 1: Do not add the NET_ADMIN capability if wireguard.spec.useWgUserspaceImplementation is true.

Phase 2: Detect the pod security standard and fall back to the userspace implementation if we are not allowed to add the NET_ADMIN capability.
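Put together, the decision could look roughly like the sketch below. chooseUserspace is a hypothetical helper, and its inputs are assumed to come from the spec flag and from whatever detection mechanism we end up with. Keeping the kernel implementation when the security level is unknown is just one possible default (it matches today's behavior), not a settled choice.

```go
package main

import "fmt"

// chooseUserspace (hypothetical helper) decides whether the userspace
// implementation should be used.
//   - requested: value of spec.useWgUserspaceImplementation (phase 1)
//   - levelKnown, netAdminAllowed: result of detecting the pod security
//     standard (phase 2)
func chooseUserspace(requested, levelKnown, netAdminAllowed bool) bool {
	if requested {
		return true // phase 1: the user explicitly asked for userspace
	}
	// Phase 2: fall back only when we know NET_ADMIN would be rejected.
	return levelKnown && !netAdminAllowed
}

func main() {
	fmt.Println("explicit flag:       ", chooseUserspace(true, false, false))
	fmt.Println("baseline detected:   ", chooseUserspace(false, true, false))
	fmt.Println("privileged detected: ", chooseUserspace(false, true, true))
	fmt.Println("level unknown:       ", chooseUserspace(false, false, false))
}
```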

jodevsa commented 4 months ago

Example of using the flag:


apiVersion: vpn.wireguard-operator.io/v1alpha1
kind: Wireguard
metadata:
  name: vpn
spec:
  useWgUserspaceImplementation: true