linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

iptables Chain already exists #7767

Closed dghubble closed 1 year ago

dghubble commented 2 years ago

What is the issue?

Following the getting started guide, the pods created by linkerd install go into CrashLoopBackOff on their init containers.

linkerd check -pre         // passes
linkerd install | kubectl apply -f -
kubectl get pods -n linkerd
NAME                                      READY   STATUS                  RESTARTS      AGE
linkerd-destination-79c6f4df89-zg88w      0/4     Init:CrashLoopBackOff   5 (58s ago)   3m51s
linkerd-identity-748c8f74f-gpzb8          0/2     Init:CrashLoopBackOff   5 (48s ago)   3m53s
linkerd-proxy-injector-5bfb9594f8-ntmbz   0/2     Init:CrashLoopBackOff   5 (60s ago)   3m51s

The init container error in the logs is related to iptables setup (shown below).
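
For reference, the init container output below can be pulled with something along these lines, using a pod name from the listing above:

kubectl logs -n linkerd linkerd-identity-748c8f74f-gpzb8 -c linkerd-init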

How can it be reproduced?

Create a fresh Kubernetes v1.23 cluster with Fedora CoreOS. I've not evaluated linkerd in years, so I can't comment on whether this is new or not.

linkerd install | kubectl apply -f -

Logs, error output, etc

time="2022-02-02T19:31:34Z" level=info msg="iptables-save -t nat"
time="2022-02-02T19:31:34Z" level=info msg="# Generated by iptables-save v1.8.7 on Wed Feb  2 19:31:34 2022\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:PROXY_INIT_REDIRECT - [0:0]\nCOMMIT\n# Completed on Wed Feb  2 19:31:34 2022\n"
time="2022-02-02T19:31:34Z" level=info msg="Will ignore port [4190 4191 4567 4568] on chain PROXY_INIT_REDIRECT"
time="2022-02-02T19:31:34Z" level=info msg="Will redirect all INPUT ports to proxy"
time="2022-02-02T19:31:34Z" level=info msg="Ignoring uid 2102"
time="2022-02-02T19:31:34Z" level=info msg="Will ignore port [443] on chain PROXY_INIT_OUTPUT"
time="2022-02-02T19:31:34Z" level=info msg="Redirecting all OUTPUT to 4140"
time="2022-02-02T19:31:34Z" level=info msg="iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1643830294"
time="2022-02-02T19:31:34Z" level=info msg="iptables: Chain already exists.\n"
time="2022-02-02T19:31:34Z" level=error msg="Aborting firewall configuration"
Error: exit status 1
Usage:
  proxy-init [flags]

Flags:
....

However, the error message is misleading; linkerd's view of the world seems to be wrong. Going onto the corresponding host and listing chains, there is no PROXY_INIT_REDIRECT.

sudo iptables -t nat -L | grep PROXY   // nothing
sudo iptables -t nat -N PROXY_INIT_REDIRECT   // succeeds
sudo iptables -t nat -X PROXY_INIT_REDIRECT  // does not change init container error

I can run the failing command on that host fine:

sudo iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1643830294
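
(For comparison, the init container runs iptables inside the pod's network namespace rather than the host's; a host-side look into that namespace would be roughly the following, where <pause-pid> is a hypothetical placeholder for the pod sandbox's PID and nsenter is assumed to be available:)

sudo nsenter -t <pause-pid> -n iptables -t nat -S | grep PROXY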

Output of linkerd check -o short

linkerd check -o short
Linkerd core checks
===================

linkerd-existence
-----------------
| No running pods for "linkerd-destination"

Environment

Possible solution

Likely, linkerd makes some host OS assumption that I've missed and that isn't covered by the pre-checks.

Additional context

Notably, I've found that the host OS is significant: with Flatcar Linux, init works fine; with Fedora CoreOS it does not. While this is similar to https://github.com/linkerd/linkerd2/issues/7749, I've filed it separately because Fedora CoreOS is not using nft yet.

Would you like to work on fixing this bug?

No response

olix0r commented 2 years ago

Thanks for the writeup @dghubble!

mateiidavid commented 2 years ago

Hey @dghubble, the detailed write-up is super appreciated! Unfortunately, it's not very easy for me to spin up a CoreOS cluster. I'd be interested to see what's being logged when the level is set to debug.

If you wouldn't mind giving it a try, the set-up would be fairly easy: either inject a pod manually and edit the manifest to set the log level on the init container, or re-install linkerd with --set proxyInit.logLevel=debug [ref], as sketched below.
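
A minimal sketch of the reinstall route, using the flag referenced above:

linkerd install --set proxyInit.logLevel=debug | kubectl apply -f -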

We're actually checking for the existence of these chains prior to configuring them: https://github.com/linkerd/linkerd2-proxy-init/blob/master/iptables/iptables.go#L67-L80. This doesn't seem to come from our check, but from the stdout of the iptables command itself? More log records will hopefully demystify what's going on there.
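
For illustration only (the real check is the Go code linked above), the distinction roughly amounts to:

# what the existence check inspects: the chain header in iptables-save output
iptables-save -t nat | grep ':PROXY_INIT_REDIRECT'
# what printed "Chain already exists": iptables itself, when asked to create the chain
iptables -t nat -N PROXY_INIT_REDIRECT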

dghubble commented 2 years ago

With debug, only one new log line is present. Separately, I suspected SELinux enforcement was a factor, so I disabled it for now, and linkerd-init seems to complete.

time="2022-02-03T21:30:23Z" level=debug msg="Tracing this script execution as [1643923823]"
time="2022-02-03T21:30:23Z" level=info msg="iptables-save -t nat"
time="2022-02-03T21:30:23Z" level=info msg="# Generated by iptables-save v1.8.7 on Thu Feb  3 21:30:23 2022\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\nCOMMIT\n# Completed on Thu Feb  3 21:30:23 2022\n"
time="2022-02-03T21:30:23Z" level=info msg="Will ignore port [4190 4191 4567 4568] on chain PROXY_INIT_REDIRECT"
time="2022-02-03T21:30:23Z" level=info msg="Will redirect all INPUT ports to proxy"
time="2022-02-03T21:30:23Z" level=info msg="Ignoring uid 2102"
time="2022-02-03T21:30:23Z" level=info msg="Will ignore port [4567 4568] on chain PROXY_INIT_OUTPUT"
time="2022-02-03T21:30:23Z" level=info msg="Redirecting all OUTPUT to 4140"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_REDIRECT -p tcp --match multiport --dports 4190,4191,4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4190,4191,4567,4568/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143 -m comment --comment proxy-init/redirect-all-incoming-to-proxy-port/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -N PROXY_INIT_OUTPUT -m comment --comment proxy-init/redirect-common-chain/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN -m comment --comment proxy-init/ignore-proxy-user-id/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN -m comment --comment proxy-init/ignore-loopback/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_OUTPUT -p tcp --match multiport --dports 4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4567,4568/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140 -m comment --comment proxy-init/redirect-all-outgoing-to-proxy-port/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables -t nat -A OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-output/1643923823"
time="2022-02-03T21:30:23Z" level=info msg="iptables-save -t nat"
time="2022-02-03T21:30:23Z" level=info msg="# Generated by iptables-save v1.8.7 on Thu Feb  3 21:30:23 2022\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:PROXY_INIT_OUTPUT - [0:0]\n:PROXY_INIT_REDIRECT - [0:0]\n-A PREROUTING -m comment --comment \"proxy-init/install-proxy-init-prerouting/1643923823\" -j PROXY_INIT_REDIRECT\n-A OUTPUT -m comment --comment \"proxy-init/install-proxy-init-output/1643923823\" -j PROXY_INIT_OUTPUT\n-A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -m comment --comment \"proxy-init/ignore-proxy-user-id/1643923823\" -j RETURN\n-A PROXY_INIT_OUTPUT -o lo -m comment --comment \"proxy-init/ignore-loopback/1643923823\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m multiport --dports 4567,4568 -m comment --comment \"proxy-init/ignore-port-4567,4568/1643923823\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m comment --comment \"proxy-init/redirect-all-outgoing-to-proxy-port/1643923823\" -j REDIRECT --to-ports 4140\n-A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4190,4191,4567,4568 -m comment --comment \"proxy-init/ignore-port-4190,4191,4567,4568/1643923823\" -j RETURN\n-A PROXY_INIT_REDIRECT -p tcp -m comment --comment \"proxy-init/redirect-all-incoming-to-proxy-port/1643923823\" -j REDIRECT --to-ports 4143\nCOMMIT\n# Completed on Thu Feb  3 21:30:23 2022\n"

Oddly, however, the containers in the linkerd pods are still waiting on the init container.

kubectl get pod linkerd-proxy-injector-56d6648965-rfhf6 -n linkerd
NAME                                      READY   STATUS            RESTARTS   AGE
linkerd-proxy-injector-56d6648965-rfhf6   0/2     PodInitializing   0          9m49s
Init Containers:                                                                                                                                              
  linkerd-init:                                                                                                                                               
    Image:         cr.l5d.io/linkerd/proxy-init:v1.5.2                                                                                                        
    ...                                                                                                                   
    State:          Terminated                                                                                                                                
      Reason:       Completed                                                                                                                                 
      Exit Code:    0                                                                                                                                                                                                                                       
    Ready:          True                                                                                                                                      
    Restart Count:  0
...
Containers:                                                                                                                                                   
  linkerd-proxy:
    Container ID:   
    Image:          cr.l5d.io/linkerd/proxy:edge-22.1.5
    ...
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0

For the SELinux issue, you might not need a full Kubernetes cluster. A VM with a distro that enforces SELinux, running the linkerd-init container similarly, might reproduce it.
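
For reference, checking whether enforcement is in play on such a host uses standard SELinux tooling (a sketch; setenforce 0 only lasts until reboot, and ausearch availability depends on the image):

getenforce                       # stock Fedora CoreOS reports Enforcing
sudo setenforce 0                # switch to permissive temporarily
sudo ausearch -m avc -ts recent  # look for AVC denials from the blocked iptables calls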

frdeng commented 2 years ago

Ran into the same issue here with SELinux enabled.

wondersd commented 2 years ago

I ran into this issue as well on coreos/typhoon and, with some modifications, was able to get it working.

Kubernetes: v1.22.4
Distro: Typhoon
OS: Fedora CoreOS (36)
CRI: docker
CNI: calico
Linkerd: stable-2.11.2

Init Containers:
  linkerd-init:
    Container ID:  docker://101a5f1875f85b6ef9f9c7eafd3eda706bf9bbd437c583367e3a09b2fa5b13db
    Image:         cr.l5d.io/linkerd/proxy-init:v1.5.3
    ...
    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   {"level":"info","msg":"iptables-save -t nat","time":"2022-06-05T20:34:09Z"}
{"level":"info","msg":"iptables-save v1.8.7 (legacy): Cannot initialize: Permission denied\n\n","time":"2022-06-05T20:34:09Z"}
{"level":"error","msg":"aborting firewall configuration","time":"2022-06-05T20:34:09Z"}
Error: exit status 1

Unfortunately, I didn't get a capture of the rest of the init container logs for the above run.

I was, however, able to get the linkerd control plane to run by hand-modifying the seLinuxOptions block for the linkerd-init container (a kubectl patch equivalent is sketched after the snippet below).

...
    initContainers:
      ...
        image: cr.l5d.io/linkerd/proxy-init:v1.5.3
        imagePullPolicy: IfNotPresent
        name: linkerd-init
        ...
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: false
          runAsUser: 0
          seLinuxOptions:
            # this effectively disables SELinux for the container
            # there's probably a more targeted way to provide the requisite permissions
            # but I don't know what that is
            type: spc_t
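
For reference, roughly the same change can be applied as a strategic-merge patch keyed on the init container name (a sketch; a later linkerd upgrade that re-renders the deployment will revert it):

kubectl patch deployment linkerd-destination -n linkerd --type strategic -p '
  {"spec": {"template": {"spec": {"initContainers": [
    {"name": "linkerd-init",
     "securityContext": {"seLinuxOptions": {"type": "spc_t"}}}
  ]}}}}'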

Curiously, I only modified the linkerd-destination deployment and afterwards the rest of the control plane components were unblocked and able to run.

Similarly, injected linkerd-init/linkerd-proxy containers in other namespaces also functioned without issue.

It seems that these iptables rules are only needed once per host; once they are set up, the linkerd-init container has the required permissions (at least in this setup) to assert that the rules exist and need no modifications.

I'm not sure how this situation changes with the CNI plugin deployment model, but it would appear that for the linkerd-init container model to work in SELinux-enabled environments (and possibly under AppArmor or other additional security layers), there would need to be a way to extend the linkerd-init container's securityContext with the extra permissions required by the nodes the workload may run on.

This should be easy enough to do for the control plane installation (Helm charts or CLI installation), but it might be tricky to extend the annotation-based configuration (https://linkerd.io/2.11/reference/proxy-configuration/) for linkerd-init containers to allow for the variety of possible options.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

mateiidavid commented 1 year ago

We've introduced a new flag to run proxy-init as privileged in #9873. Running as privileged (together with root access) essentially means that proxy-init will have the same privileges as a process running as root on the host. While this is not ideal from a security perspective, it should allow iptables to function properly even in restricted environments (e.g. a distribution running SELinux/AppArmor).
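
As a sketch, enabling it at install time would be along these lines (the value name is assumed to be proxyInit.privileged; see #9873 and the chart for the exact name):

linkerd install --set proxyInit.privileged=true | kubectl apply -f -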

If people continue experiencing issues, let us know and we'll look into it. Due to a lack of resources to get a reproducible environment, we rely on the community to test these features out.