istio / istio

Connect, secure, control, and observe services.
https://istio.io
Apache License 2.0

istio-cni 1.19.0 breaks with multus config that's working fine with 1.16.3 #48734

Closed VivekSubr closed 4 months ago

VivekSubr commented 8 months ago

Bug Description

We install istio 1.19.0 on a kubernetes cluster, with multus being configured in istio operator as:

  cni:
    cniConfDir: /etc/cni/multus/net.d
    chained: false

root@aks-worker-19470471-vmss000002 [ /etc/cni/multus/net.d ]# ls
YYY-istio-cni.conf ZZZ-istio-cni-kubeconfig

Attaching the config files in this folder... this config worked fine for istio 1.16.3, but fails with 1.19.0 - istio-validation containers terminate and pods crash loop.

2023-12-14T04:51:18.982206Z     info    Starting iptables validation. This check verifies that iptables rules are properly established for the network.
2023-12-14T04:51:18.982272Z     info    Listening on 127.0.0.1:15001
2023-12-14T04:51:18.982416Z     info    Listening on 127.0.0.1:15006
2023-12-14T04:51:18.982582Z     error   Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2023-12-14T04:51:19.982851Z     error   Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2023-12-14T04:51:20.983038Z     error   Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused

Attaching multus config and istio-cni logs - logs.zip

Attaching cni config in /etc/cni/net.d: cni-config - 1_19.txt

Version

subramaniamv@dev-subramaniamv:/localdata/subramaniamv$ ./istioctl -i fed-istio version
client version: 1.18.2
control plane version: 1.19.0
data plane version: 1.16-dev (6 proxies)
subramaniamv@dev-subramaniamv:/localdata/subramaniamv$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.3

Additional Information

bug-report.tar.gz

VivekSubr commented 8 months ago

Also, I note that the cni test cases in istio/cni/test/install don't have one for this use case, multiple CNIs using multus... do you know what that might look like? What are the input files in this case?

howardjohn commented 8 months ago

cc @jwendell @jacob-delgado may have experience with multus

VivekSubr commented 8 months ago

@howardjohn @jwendell @jacob-delgado ... Hi, just one more question... I don't see this 15002 port in iptables rules of istio proxy

root@aks-worker-12986694-vmss000003 [ / ]# sudo nsenter -t 36148 -n iptables-save
# Generated by iptables-save v1.8.7 on Thu Jan 18 06:49:10 2024
*nat
:PREROUTING ACCEPT [3435:206100]
:INPUT ACCEPT [3436:206160]
:OUTPUT ACCEPT [3497:224372]
:POSTROUTING ACCEPT [3500:224552]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 15008 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 2379 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15021 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15090 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -p tcp -m tcp --dport 2379 -j RETURN
-A ISTIO_OUTPUT -p tcp -m tcp --dport 10255 -j RETURN
-A ISTIO_OUTPUT -p tcp -m tcp --dport 27017 -j RETURN
-A ISTIO_OUTPUT -p tcp -m tcp --dport 8200 -j RETURN
-A ISTIO_OUTPUT -p tcp -m tcp --dport 6379 -j RETURN
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.0/24 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Thu Jan 18 06:49:10 2024

Who creates the listening socket for this port? 127.0.0.6:15002. In fact in 1.16.3, I don't see the port open at all

root@aks-worker-12986694-vmss000003 [ / ]# sudo nsenter -t 36148 -n ss -tulpn
Netid   State    Recv-Q    Send-Q       Local Address:Port        Peer Address:Port   Process
tcp     LISTEN   0         4096               0.0.0.0:15090            0.0.0.0:*       users:(("envoy",pid=36148,fd=22))
tcp     LISTEN   0         4096               0.0.0.0:15090            0.0.0.0:*       users:(("envoy",pid=36148,fd=21))
tcp     LISTEN   0         4096             127.0.0.1:15000            0.0.0.0:*       users:(("envoy",pid=36148,fd=18))
tcp     LISTEN   0         4096               0.0.0.0:15001            0.0.0.0:*       users:(("envoy",pid=36148,fd=34))
tcp     LISTEN   0         4096               0.0.0.0:15001            0.0.0.0:*       users:(("envoy",pid=36148,fd=33))
tcp     LISTEN   0         4096             127.0.0.1:15004            0.0.0.0:*       users:(("pilot-agent",pid=36107,fd=15))
tcp     LISTEN   0         4096               0.0.0.0:15006            0.0.0.0:*       users:(("envoy",pid=36148,fd=36))
tcp     LISTEN   0         4096               0.0.0.0:15006            0.0.0.0:*       users:(("envoy",pid=36148,fd=35))
tcp     LISTEN   0         4096               0.0.0.0:15021            0.0.0.0:*       users:(("envoy",pid=36148,fd=24))
tcp     LISTEN   0         4096               0.0.0.0:15021            0.0.0.0:*       users:(("envoy",pid=36148,fd=23))
tcp     LISTEN   0         4096                     *:15020                  *:*       users:(("pilot-agent",pid=36107,fd=3))

howardjohn commented 8 months ago

15002 is just a bogus port to test the iptables. We dial :15002 and, if iptables is set up, the connection is redirected to 15001 and succeeds. If it fails, we know iptables is not set up.
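
The probe described above can be sketched in a few lines of Go. This is only an illustration of the idea, not the validator's actual code: the helper names `startListener` and `probe` are hypothetical, and the addresses are taken from the log lines earlier in the thread.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// startListener opens a TCP listener on addr and accepts (then closes)
// connections in the background. It stands in for the validator's listener.
func startListener(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			conn.Close()
		}
	}()
	return ln, nil
}

// probe reports whether a TCP connection to addr succeeds within one second.
func probe(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	ln, err := startListener("127.0.0.1:15001")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()

	// Nothing listens on 15002, so this dial can only succeed if a REDIRECT
	// rule rewrites the destination to a port that does have a listener.
	if probe("127.0.0.6:15002") {
		fmt.Println("redirect in place")
	} else {
		fmt.Println("connection failed: redirect missing")
	}
}
```

Run inside a network namespace without the redirect rules, the dial to 15002 has nothing to land on, which matches the "connection refused" lines in the original report.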

VivekSubr commented 8 months ago

Hi @howardjohn, thanks for the reply... have one more question - so, I'm trying to debug istio-iptables for working and non-working cases, and I can see that there's an env variable that enables iptables logging:

var TraceLoggingEnabled = env.Register(
    "IPTABLES_TRACE_LOGGING",
    false,
    "When enable, all iptables actions will be logged. "+
        "This requires NET_ADMIN privilege and has noisy logs; as a result, this is intended for debugging only").Get()

In https://github.com/istio/istio/blob/master/tools/istio-iptables/pkg/log/nflog.go

Unless I'm much mistaken, this is what's used by the istio-cni binary placed on nodes by the istio-cni daemonset? I've tried enabling it like this:

defaultConfig:
      proxyMetadata:
        IPTABLES_TRACE_LOGGING: "true"

But I don't see the logs in dmesg on node... is there any specific way to enable and view these logs?

howardjohn commented 8 months ago

That might not work with CNI
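
One way to picture why: a flag like this is read from the environment of whichever process actually applies the iptables rules. The sketch below uses only the Go standard library rather than Istio's `env` package, and `traceLoggingEnabled` is a hypothetical helper. With CNI, the rules are applied by the plugin binary on the node, so a variable set through proxyMetadata on the sidecar would presumably never be visible to it.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// traceLoggingEnabled reads IPTABLES_TRACE_LOGGING through the supplied
// lookup function and falls back to false when it is unset or malformed.
func traceLoggingEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("IPTABLES_TRACE_LOGGING")
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(raw)
	return err == nil && enabled
}

func main() {
	// os.LookupEnv only sees the environment of this process, not that of
	// any other container or node-level binary.
	fmt.Println("trace logging enabled:", traceLoggingEnabled(os.LookupEnv))
}
```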

VivekSubr commented 8 months ago

@howardjohn - oh, okay... is there any way to enable logging for the istio-cni binary?

howardjohn commented 8 months ago

I think this is from the -A ISTIO_OUTPUT -d 127.0.0.0/24 -j RETURN line. That looks like custom config?
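
The address arithmetic behind that suspicion is easy to check: the probe destination 127.0.0.6 falls inside 127.0.0.0/24 but not inside 127.0.0.1/32. The sketch below uses Go's `net/netip`, and `inPrefix` is a hypothetical helper. It only tests prefix membership and says nothing about which earlier rule in the chain would match first.

```go
package main

import (
	"fmt"
	"net/netip"
)

// inPrefix reports whether addr lies within the CIDR prefix.
func inPrefix(prefix, addr string) (bool, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return false, err
	}
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	return p.Contains(a), nil
}

func main() {
	// The two destination ranges that appear in RETURN rules in the dump.
	for _, prefix := range []string{"127.0.0.1/32", "127.0.0.0/24"} {
		ok, err := inPrefix(prefix, "127.0.0.6")
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("127.0.0.6 in %s: %v\n", prefix, ok)
	}
}
```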

VivekSubr commented 8 months ago

No, it's not... it's vanilla 1.16.3, and that is the working case. In the non-working case, the pods don't come up, so I can't take an iptables dump; I need to see logging from the istio-cni binary somehow.

howardjohn commented 8 months ago

the logging won't work if the pod doesn't come up either. You can use an ephemeral container to connect to the pod even while it's crashing

VivekSubr commented 7 months ago

@howardjohn, hi, so the root cause of this looks to be that operator or cni is not honoring the config 'chained=false'.

In Operator, we configure,

  values:
    cni:
      image: <cna-istio-cni>
      cniConfDir: /etc/cni/multus/net.d
      chained: false

But in istio-sidecar-injector configmap, I see it's true.

      "istio_cni": {
        "chained": true,
        "enabled": true
      },

This should be false so that multus annotation is added to pods: https://github.com/istio/istio/blob/49f7ce268455196445d5b5b0a7fbbae94a5d74f8/manifests/charts/istiod-remote/files/injection-template.yaml#L50

I also tried adding it to the args of istio-cni daemonset, like so,

      containers:
      - args:
        - --log_output_level=default:info
        - --chained-cni-plugin=false

This also doesn't work. Attaching the istio mutating webhook and config map for 1.16 and 1.19: istio-inject.zip

howardjohn commented 7 months ago

  values:
    cni:
      image: <cna-istio-cni>
      cniConfDir: /etc/cni/multus/net.d
      chained: false

Note there are two configs: istiod, and the CNI itself. You are only configuring the latter here. The injection template depends on the istiod config, which is under istio_cni, not cni.
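
A small sanity check over parsed operator values can make this kind of mismatch visible. The Go sketch below is purely illustrative: `lookupBool` and `chainedMismatch` are hypothetical helpers, and the only key names assumed are the two mentioned in this thread, `cni.chained` and `istio_cni.chained`.

```go
package main

import "fmt"

// lookupBool walks nested maps along path and returns the bool stored there.
// The second result is false when the path is missing or not a bool.
func lookupBool(values map[string]any, path ...string) (bool, bool) {
	var cur any = values
	for _, key := range path {
		m, ok := cur.(map[string]any)
		if !ok {
			return false, false
		}
		cur, ok = m[key]
		if !ok {
			return false, false
		}
	}
	b, ok := cur.(bool)
	return b, ok
}

// chainedMismatch flags values where cni.chained and istio_cni.chained
// differ, or where only one of the two is set.
func chainedMismatch(values map[string]any) bool {
	cni, okCNI := lookupBool(values, "cni", "chained")
	istiod, okIstiod := lookupBool(values, "istio_cni", "chained")
	return okCNI != okIstiod || (okCNI && cni != istiod)
}

func main() {
	// Mirrors the operator values from the thread: only the cni block is set.
	values := map[string]any{
		"cni": map[string]any{
			"cniConfDir": "/etc/cni/multus/net.d",
			"chained":    false,
		},
	}
	fmt.Println("mismatch:", chainedMismatch(values))
}
```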

VivekSubr commented 7 months ago

@howardjohn @jacob-delgado

Hi, root-caused it. It looks to be a misconfiguration that happened to work until the changes introduced by: https://github.com/istio/istio/pull/45207

Before this, istio was injecting 'k8s.v1.cni.cncf.io/networks: istio-cni', but it looks like that got changed here to default/istio-cni.

With this change, the expectation is that if we want the resource in our own namespace, we need to inject the annotation manually, but when we tried that, our value was appended rather than replacing the injected one:

k8s.v1.cni.cncf.io/networks: istio-cni, default/istio-cni

We patched it back to istio-cni and set the istio_cni config to get it working.
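
To see why the appended value matters, it helps to split the annotation into the network references it names. The Go sketch below assumes the comma-separated `[namespace/]name` form of the annotation and ignores any `@interface` suffix or JSON form; `networkRef` and `parseNetworks` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// networkRef is one entry of a k8s.v1.cni.cncf.io/networks annotation.
// An empty Namespace means the entry carried no namespace prefix.
type networkRef struct {
	Namespace string
	Name      string
}

// parseNetworks splits a comma-separated annotation value into references.
func parseNetworks(annotation string) []networkRef {
	var refs []networkRef
	for _, item := range strings.Split(annotation, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		ns, name, found := strings.Cut(item, "/")
		if !found {
			// No namespace prefix: the whole item is the network name.
			ns, name = "", item
		}
		refs = append(refs, networkRef{Namespace: ns, Name: name})
	}
	return refs
}

func main() {
	// The appended value reported above names two separate references.
	for _, ref := range parseNetworks("istio-cni, default/istio-cni") {
		fmt.Printf("namespace=%q name=%q\n", ref.Namespace, ref.Name)
	}
}
```

Under that assumption, the appended value yields two entries, one without a namespace and one pinned to default, rather than a single overridden reference.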

istio-policy-bot commented 4 months ago

🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2024-02-05. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions.

Created by the issue and PR lifecycle manager.