weaveworks / weave

Simple, resilient multi-host container networking and more.
https://www.weave.works
Apache License 2.0

NetworkPolicy only enforced in pod-to-pod traffic, not when using services #3452

Open maxbischoff opened 6 years ago

maxbischoff commented 6 years ago

What you expected to happen?

When following the Kubernetes tutorial on declaring network policies, I expect wget from an unlabelled pod to the nginx service to time out.

What happened?

When using wget on the service hostname, the nginx pod can be reached. When accessing it directly via the pod IP, access is blocked.

$ kubectl get pod,svc,networkpolicy -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE                 NOMINATED NODE
pod/nginx-6f858d4d45-dt6fw   1/1     Running   0          35m   10.32.0.2   thesis-test-node-0   <none>

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE   SELECTOR
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   37m   <none>
service/nginx        ClusterIP   10.96.140.192   <none>        80/TCP    35m   run=nginx

NAME                                    POD-SELECTOR   AGE
networkpolicy.extensions/access-nginx   run=nginx      36m

$ kubectl run busybox --rm -ti --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget --spider -T 1 nginx
Connecting to nginx (10.96.140.192:80)
/ # wget --spider -T 1 10.32.0.2
Connecting to 10.32.0.2 (10.32.0.2:80)
wget: download timed out
/ # 

How to reproduce it?

As in the tutorial:

  1. kubectl run nginx --image=nginx --expose --port 80
  2. Store the policy:
    $ echo 'kind: NetworkPolicy
    > apiVersion: networking.k8s.io/v1
    > metadata:
    >   name: access-nginx
    > spec:
    >   podSelector:
    >     matchLabels:
    >       run: nginx
    >   ingress:
    >   - from:
    >     - podSelector:
    >         matchLabels:
    >           access: "true"' > nginx-policy.yaml
  3. And apply it: kubectl apply -f nginx-policy.yaml
  4. Finally test it (a positive check with a pod labelled access=true is sketched after these steps):
    $ kubectl run busybox --rm -ti --image=busybox /bin/sh
    If you don't see a command prompt, try pressing enter.
    / # wget --spider -T 1 nginx
    Connecting to nginx (10.96.140.192:80)
    / # wget --spider -T 1 10.32.0.2
    Connecting to 10.32.0.2 (10.32.0.2:80)
    wget: download timed out
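
For comparison, the tutorial's positive case is that a client pod carrying the access=true label can reach nginx. A minimal check along those lines (a sketch; it assumes the policy above is already applied):

$ kubectl run busybox --rm -ti --labels="access=true" --image=busybox /bin/sh
/ # wget --spider -T 1 nginx        # expected to succeed
/ # wget --spider -T 1 10.32.0.2    # expected to succeed, since the label matches the policy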

Anything else we need to know?

Versions:

$ kubectl exec -n kube-system weave-net-cr5ng -c weave -- /home/weave/weave --local status

        Version: 2.5.0 (up to date; next check at 2018/11/19 17:36:29)

        Service: router
       Protocol: weave 1..2
           Name: 42:21:66:1c:43:ea(thesis-test-node-0)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 2
    Connections: 2 (1 established, 1 failed)
          Peers: 2 (with 2 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-30T21:39:16Z", GoVersion:"go1.11.1", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Logs:

$  kubectl logs -n kube-system weave-net-cr5ng weave
INFO: 2018/11/19 10:26:19.277194 Command line options: map[docker-api: port:6783 expect-npc:true nickname:thesis-test-node-0 ipalloc-range:10.32.0.0/12 name:42:21:66:1c:43:ea no-dns:true conn-limit:100 db-prefix:/weavedb/weave-net http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 metrics-addr:0.0.0.0:6782 datapath:datapath host-root:/host]
INFO: 2018/11/19 10:26:19.277535 weave  2.5.0
INFO: 2018/11/19 10:26:19.413898 Bridge type is bridged_fastdp
INFO: 2018/11/19 10:26:19.413917 Communication between peers is unencrypted.
INFO: 2018/11/19 10:26:19.420441 Our name is 42:21:66:1c:43:ea(thesis-test-node-0)
INFO: 2018/11/19 10:26:19.420486 Launch detected - using supplied peer list: [172.16.0.14 172.16.0.7]
INFO: 2018/11/19 10:26:19.442706 Unable to fetch ConfigMap kube-system/weave-net to infer unique cluster ID
INFO: 2018/11/19 10:26:19.442735 Checking for pre-existing addresses on weave bridge
INFO: 2018/11/19 10:26:19.567869 [allocator 42:21:66:1c:43:ea] No valid persisted data
INFO: 2018/11/19 10:26:19.588340 [allocator 42:21:66:1c:43:ea] Initialising via deferred consensus
INFO: 2018/11/19 10:26:19.588408 Sniffing traffic on datapath (via ODP)
INFO: 2018/11/19 10:26:19.588911 ->[172.16.0.14:6783] attempting connection
INFO: 2018/11/19 10:26:19.589128 ->[172.16.0.7:6783] attempting connection
INFO: 2018/11/19 10:26:19.590644 ->[172.16.0.7:50067] connection accepted
INFO: 2018/11/19 10:26:19.591129 ->[172.16.0.14:6783] error during connection attempt: dial tcp4 :0->172.16.0.14:6783: connect: connection refused
INFO: 2018/11/19 10:26:19.591649 ->[172.16.0.7:50067|42:21:66:1c:43:ea(thesis-test-node-0)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/11/19 10:26:19.591986 ->[172.16.0.7:6783|42:21:66:1c:43:ea(thesis-test-node-0)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/11/19 10:26:19.594747 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/11/19 10:26:19.595033 Listening for metrics requests on 0.0.0.0:6782
INFO: 2018/11/19 10:26:20.165387 [kube-peers] Added myself to peer list &{[{42:21:66:1c:43:ea thesis-test-node-0}]}
DEBU: 2018/11/19 10:26:20.182226 [kube-peers] Nodes that have disappeared: map[]
INFO: 2018/11/19 10:26:22.343955 ->[172.16.0.14:6783] attempting connection
INFO: 2018/11/19 10:26:22.344454 ->[172.16.0.14:6783] error during connection attempt: dial tcp4 :0->172.16.0.14:6783: connect: connection refused
INFO: 2018/11/19 10:26:22.733128 ->[172.16.0.14:58157] connection accepted
INFO: 2018/11/19 10:26:22.743480 ->[172.16.0.14:58157|c6:6f:93:00:ed:2d(thesis-test-master)]: connection ready; using protocol version 2
INFO: 2018/11/19 10:26:22.743769 overlay_switch ->[c6:6f:93:00:ed:2d(thesis-test-master)] using fastdp
INFO: 2018/11/19 10:26:22.743898 ->[172.16.0.14:58157|c6:6f:93:00:ed:2d(thesis-test-master)]: connection added (new peer)
INFO: 2018/11/19 10:26:22.773344 ->[172.16.0.14:58157|c6:6f:93:00:ed:2d(thesis-test-master)]: connection fully established
10.32.0.1
INFO: 2018/11/19 10:26:22.811699 Discovered remote MAC c6:d8:31:34:af:1a at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 10:26:22.812119 Discovered remote MAC 0a:f6:76:45:66:f4 at c6:6f:93:00:ed:2d(thesis-test-master)
172.16.0.14
172.16.0.7
DEBU: 2018/11/19 10:26:23.141074 registering for updates for node delete events
INFO: 2018/11/19 10:26:23.231335 Discovered remote MAC c6:6f:93:00:ed:2d at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 10:26:23.249378 EMSGSIZE on send, expecting PMTU update (IP packet was 60028 bytes, payload was 60020 bytes)
INFO: 2018/11/19 10:26:23.251890 sleeve ->[172.16.0.14:6783|c6:6f:93:00:ed:2d(thesis-test-master)]: Effective MTU verified at 8888
INFO: 2018/11/19 10:58:04.192022 Discovered remote MAC 0a:f6:76:45:66:f4 at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 11:00:48.028707 Discovered remote MAC c6:d8:31:34:af:1a at c6:6f:93:00:ed:2d(thesis-test-master)
WARN: 2018/11/19 11:00:57.189662 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 42:21:66:1c:43:ea, dst: 92:77:a6:28:80:15} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
INFO: 2018/11/19 11:06:15.735723 Discovered remote MAC c6:6f:93:00:ed:2d at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 11:28:06.466739 Discovered remote MAC 0a:f6:76:45:66:f4 at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 11:34:39.677498 Discovered remote MAC c6:d8:31:34:af:1a at c6:6f:93:00:ed:2d(thesis-test-master)
INFO: 2018/11/19 11:45:35.058523 Discovered remote MAC c6:6f:93:00:ed:2d at c6:6f:93:00:ed:2d(thesis-test-master)

Network:

$ ip route
$ ip -4 -o addr
$ sudo iptables-save
bboreham commented 6 years ago

$ sudo iptables-save

Could you show the output of this command too?

maxbischoff commented 6 years ago
$ sudo iptables-save
# Generated by iptables-save v1.6.1 on Mon Nov 19 12:23:23 2018
*filter
:INPUT ACCEPT [1036586:248096517]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [1036374:275960144]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:WEAVE-NPC - [0:0]
:WEAVE-NPC-DEFAULT - [0:0]
:WEAVE-NPC-EGRESS - [0:0]
:WEAVE-NPC-EGRESS-ACCEPT - [0:0]
:WEAVE-NPC-EGRESS-CUSTOM - [0:0]
:WEAVE-NPC-EGRESS-DEFAULT - [0:0]
:WEAVE-NPC-INGRESS - [0:0]
-A INPUT -j KUBE-FIREWALL
-A INPUT -i weave -j WEAVE-NPC-EGRESS
-A FORWARD -i weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC-EGRESS
-A FORWARD -o weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -d 224.0.0.0/4 -j ACCEPT
-A WEAVE-NPC -m physdev --physdev-out vethwe-bridge -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-;rGqyMIl1HN^cfDki~Z$3]6!N dst -m comment --comment "DefaultAllow ingress isolation for namespace: default" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-system" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-public" -j ACCEPT
-A WEAVE-NPC-EGRESS -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC-EGRESS -m physdev --physdev-in vethwe-bridge -j RETURN
-A WEAVE-NPC-EGRESS -d 224.0.0.0/4 -j RETURN
-A WEAVE-NPC-EGRESS -m state --state NEW -j WEAVE-NPC-EGRESS-DEFAULT
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j WEAVE-NPC-EGRESS-CUSTOM
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j NFLOG --nflog-group 86
-A WEAVE-NPC-EGRESS -m mark ! --mark 0x40000/0x40000 -j DROP
-A WEAVE-NPC-EGRESS-ACCEPT -j MARK --set-xmark 0x40000/0x40000
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j RETURN
-A WEAVE-NPC-INGRESS -m set --match-set weave-{U;]TI.l|MdRzDhN7$NRn[t)d src -m set --match-set weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( dst -m comment --comment "pods: namespace: default, selector: access=true -> pods: namespace: default, selector: run=nginx (ingress)" -j ACCEPT
COMMIT
# Completed on Mon Nov 19 12:23:23 2018
# Generated by iptables-save v1.6.1 on Mon Nov 19 12:23:23 2018
*nat
:PREROUTING ACCEPT [10:584]
:INPUT ACCEPT [10:584]
:OUTPUT ACCEPT [15:997]
:POSTROUTING ACCEPT [15:997]
:KUBE-FIREWALL - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SERVICES - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -j WEAVE
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-NODE-PORT -j KUBE-MARK-MASQ
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-POSTROUTING -m comment --comment "Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose" -m set --match-set KUBE-LOOP-BACK dst,dst,src -j MASQUERADE
-A KUBE-SERVICES ! -s 10.32.0.0/12 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A WEAVE -s 10.32.0.0/12 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 10.32.0.0/12 -d 10.32.0.0/12 -j MASQUERADE
-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE
COMMIT
# Completed on Mon Nov 19 12:23:23 2018
bboreham commented 6 years ago

Interesting - you don't have any iptables rules redirecting Kubernetes services. Are you using kube-proxy in ipvs mode?

I suspect the source address is getting masqueraded, so the connection no longer looks like it is coming from the pod.
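
One way to check that suspicion (a sketch; it assumes conntrack-tools is installed on the node hosting the nginx pod, and uses the pod IP from the outputs above):

# List tracked TCP connections involving the nginx pod IP. If the connection's source
# is the node's address rather than the busybox pod IP, the traffic was masqueraded.
$ sudo conntrack -L -p tcp | grep 10.32.0.2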

maxbischoff commented 6 years ago

Yes, it's running in ipvs mode:

sudo ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  thesis-test-master:https rr
  -> 185.113.124.205:6443         Masq    1      4          0         
TCP  thesis-test-master:domain rr
  -> 10.32.0.3:domain             Masq    1      0          0         
  -> 10.32.0.4:domain             Masq    1      0          0         
TCP  thesis-test-master:http rr
  -> 10.32.0.2:http               Masq    1      0          0         
UDP  thesis-test-master:domain rr
  -> 10.32.0.3:domain             Masq    1      0          0         
  -> 10.32.0.4:domain             Masq    1      0          0    
murali-reddy commented 6 years ago

Observing this

/ # wget --spider -T 1 nginx
Connecting to nginx (10.96.140.192:80)
/ # wget --spider -T 1 10.32.0.2
Connecting to 10.32.0.2 (10.32.0.2:80)
wget: download timed out

It is interesting that access to the service IP goes through. Since a network policy is applied in the namespace, the default rule is to drop the packet, and since we don't have any pods matching the access: "true" label, all traffic should have been blocked.

Could you please share the ipset save and iptables-save output for both nodes where the busybox and nginx pods are running?

maxbischoff commented 6 years ago

Nodes & Pods:

$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE                 NOMINATED NODE
busybox-7cd98849ff-4zn5s   1/1     Running   0          32m   10.32.0.5   thesis-test-node-0   <none>
nginx-6f858d4d45-dt6fw     1/1     Running   0          22h   10.32.0.2   thesis-test-node-0   <none>

$ kubectl get nodes
NAME                 STATUS   ROLES    AGE   VERSION
thesis-test-master   Ready    master   22h   v1.11.3
thesis-test-node-0   Ready    <none>   22h   v1.11.3

node/thesis-test-master :

$ sudo ipset save
create KUBE-NODE-PORT-LOCAL-UDP bitmap:port range 0-65535
create KUBE-LOOP-BACK hash:ip,port,ip family inet hashsize 1024 maxelem 65536
add KUBE-LOOP-BACK 185.113.124.205,tcp:6443,185.113.124.205
add KUBE-LOOP-BACK 10.32.0.3,udp:53,10.32.0.3
add KUBE-LOOP-BACK 10.32.0.4,tcp:53,10.32.0.4
add KUBE-LOOP-BACK 10.32.0.2,tcp:80,10.32.0.2
add KUBE-LOOP-BACK 10.32.0.4,udp:53,10.32.0.4
add KUBE-LOOP-BACK 10.32.0.3,tcp:53,10.32.0.3
create KUBE-EXTERNAL-IP hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-FW hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-SOURCE-IP hash:ip,port,ip family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-SOURCE-CIDR hash:ip,port,net family inet hashsize 1024 maxelem 65536
create KUBE-CLUSTER-IP hash:ip,port family inet hashsize 1024 maxelem 65536
add KUBE-CLUSTER-IP 10.96.0.10,tcp:53
add KUBE-CLUSTER-IP 10.96.0.1,tcp:443
add KUBE-CLUSTER-IP 10.96.140.192,tcp:80
add KUBE-CLUSTER-IP 10.96.0.10,udp:53
create KUBE-LOAD-BALANCER-LOCAL hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-NODE-PORT-TCP bitmap:port range 0-65535
create KUBE-NODE-PORT-LOCAL-TCP bitmap:port range 0-65535
create KUBE-NODE-PORT-UDP bitmap:port range 0-65535
create weave-;rGqyMIl1HN^cfDki~Z$3]6!N hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-;rGqyMIl1HN^cfDki~Z$3]6!N 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
create weave-s_+ChJId4Uy_$}G;WdH|~TK)I hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-s_+ChJId4Uy_$}G;WdH|~TK)I 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
add weave-s_+ChJId4Uy_$}G;WdH|~TK)I 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
create weave-k?Z;25^M}|1s7P3|H9i;*;MhG hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-k?Z;25^M}|1s7P3|H9i;*;MhG 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
add weave-k?Z;25^M}|1s7P3|H9i;*;MhG 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
create weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
add weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
create weave-E1ney4o[ojNrLk.6rOHi;7MPE hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-E1ney4o[ojNrLk.6rOHi;7MPE 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
add weave-E1ney4o[ojNrLk.6rOHi;7MPE 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
create weave-iuZcey(5DeXbzgRFs8Szo]+@p hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-iuZcey(5DeXbzgRFs8Szo]+@p 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
add weave-iuZcey(5DeXbzgRFs8Szo]+@p 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
create weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-41s)5vQ^o/xWGz6a20N:~?#|E hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-4vtqMI+kx/2]jD%_c0S%thO%V hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
create weave-{U;]TI.l|MdRzDhN7$NRn[t)d hash:ip family inet hashsize 1024 maxelem 65536 comment
$ sudo iptables-save
# Generated by iptables-save v1.6.1 on Tue Nov 20 08:40:25 2018
*filter
:INPUT ACCEPT [57611:14060452]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [57453:15774391]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:WEAVE-NPC - [0:0]
:WEAVE-NPC-DEFAULT - [0:0]
:WEAVE-NPC-EGRESS - [0:0]
:WEAVE-NPC-EGRESS-ACCEPT - [0:0]
:WEAVE-NPC-EGRESS-CUSTOM - [0:0]
:WEAVE-NPC-EGRESS-DEFAULT - [0:0]
:WEAVE-NPC-INGRESS - [0:0]
-A INPUT -j KUBE-FIREWALL
-A INPUT -i weave -j WEAVE-NPC-EGRESS
-A FORWARD -i weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC-EGRESS
-A FORWARD -o weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -d 224.0.0.0/4 -j ACCEPT
-A WEAVE-NPC -m physdev --physdev-out vethwe-bridge -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-;rGqyMIl1HN^cfDki~Z$3]6!N dst -m comment --comment "DefaultAllow ingress isolation for namespace: default" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-system" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-public" -j ACCEPT
-A WEAVE-NPC-EGRESS -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC-EGRESS -m physdev --physdev-in vethwe-bridge -j RETURN
-A WEAVE-NPC-EGRESS -d 224.0.0.0/4 -j RETURN
-A WEAVE-NPC-EGRESS -m state --state NEW -j WEAVE-NPC-EGRESS-DEFAULT
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j WEAVE-NPC-EGRESS-CUSTOM
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j NFLOG --nflog-group 86
-A WEAVE-NPC-EGRESS -m mark ! --mark 0x40000/0x40000 -j DROP
-A WEAVE-NPC-EGRESS-ACCEPT -j MARK --set-xmark 0x40000/0x40000
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j RETURN
-A WEAVE-NPC-INGRESS -m set --match-set weave-{U;]TI.l|MdRzDhN7$NRn[t)d src -m set --match-set weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( dst -m comment --comment "pods: namespace: default, selector: access=true -> pods: namespace: default, selector: run=nginx (ingress)" -j ACCEPT
COMMIT
# Completed on Tue Nov 20 08:40:25 2018
# Generated by iptables-save v1.6.1 on Tue Nov 20 08:40:25 2018
*nat
:PREROUTING ACCEPT [1:60]
:INPUT ACCEPT [1:60]
:OUTPUT ACCEPT [9:637]
:POSTROUTING ACCEPT [9:637]
:KUBE-FIREWALL - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SERVICES - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -j WEAVE
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-NODE-PORT -j KUBE-MARK-MASQ
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-POSTROUTING -m comment --comment "Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose" -m set --match-set KUBE-LOOP-BACK dst,dst,src -j MASQUERADE
-A KUBE-SERVICES ! -s 10.32.0.0/12 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A WEAVE -s 10.32.0.0/12 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 10.32.0.0/12 -d 10.32.0.0/12 -j MASQUERADE
-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE
COMMIT
# Completed on Tue Nov 20 08:40:25 2018

node/thesis-test-node-0:

$ sudo ipset save
create KUBE-CLUSTER-IP hash:ip,port family inet hashsize 1024 maxelem 65536
add KUBE-CLUSTER-IP 10.96.0.10,udp:53
add KUBE-CLUSTER-IP 10.96.140.192,tcp:80
add KUBE-CLUSTER-IP 10.96.0.10,tcp:53
add KUBE-CLUSTER-IP 10.96.0.1,tcp:443
create KUBE-LOAD-BALANCER-LOCAL hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-SOURCE-CIDR hash:ip,port,net family inet hashsize 1024 maxelem 65536
create KUBE-NODE-PORT-TCP bitmap:port range 0-65535
create KUBE-NODE-PORT-LOCAL-TCP bitmap:port range 0-65535
create KUBE-LOOP-BACK hash:ip,port,ip family inet hashsize 1024 maxelem 65536
add KUBE-LOOP-BACK 185.113.124.205,tcp:6443,185.113.124.205
add KUBE-LOOP-BACK 10.32.0.2,tcp:80,10.32.0.2
add KUBE-LOOP-BACK 10.32.0.3,udp:53,10.32.0.3
add KUBE-LOOP-BACK 10.32.0.4,tcp:53,10.32.0.4
add KUBE-LOOP-BACK 10.32.0.4,udp:53,10.32.0.4
add KUBE-LOOP-BACK 10.32.0.3,tcp:53,10.32.0.3
create KUBE-LOAD-BALANCER hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-FW hash:ip,port family inet hashsize 1024 maxelem 65536
create KUBE-LOAD-BALANCER-SOURCE-IP hash:ip,port,ip family inet hashsize 1024 maxelem 65536
create KUBE-NODE-PORT-UDP bitmap:port range 0-65535
create KUBE-NODE-PORT-LOCAL-UDP bitmap:port range 0-65535
create KUBE-EXTERNAL-IP hash:ip,port family inet hashsize 1024 maxelem 65536
create weave-;rGqyMIl1HN^cfDki~Z$3]6!N hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-;rGqyMIl1HN^cfDki~Z$3]6!N 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
create weave-s_+ChJId4Uy_$}G;WdH|~TK)I hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-s_+ChJId4Uy_$}G;WdH|~TK)I 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
add weave-s_+ChJId4Uy_$}G;WdH|~TK)I 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
create weave-k?Z;25^M}|1s7P3|H9i;*;MhG hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-k?Z;25^M}|1s7P3|H9i;*;MhG 10.32.0.5 comment "namespace: default, pod: busybox-7cd98849ff-4zn5s"
add weave-k?Z;25^M}|1s7P3|H9i;*;MhG 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
create weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
add weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
create weave-E1ney4o[ojNrLk.6rOHi;7MPE hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-E1ney4o[ojNrLk.6rOHi;7MPE 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
add weave-E1ney4o[ojNrLk.6rOHi;7MPE 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
create weave-iuZcey(5DeXbzgRFs8Szo]+@p hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-iuZcey(5DeXbzgRFs8Szo]+@p 10.32.0.3 comment "namespace: kube-system, pod: coredns-78fcdf6894-ddtfq"
add weave-iuZcey(5DeXbzgRFs8Szo]+@p 10.32.0.4 comment "namespace: kube-system, pod: coredns-78fcdf6894-85kxl"
create weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-41s)5vQ^o/xWGz6a20N:~?#|E hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-4vtqMI+kx/2]jD%_c0S%thO%V hash:ip family inet hashsize 1024 maxelem 65536 comment
create weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( hash:ip family inet hashsize 1024 maxelem 65536 comment
add weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( 10.32.0.2 comment "namespace: default, pod: nginx-6f858d4d45-dt6fw"
create weave-{U;]TI.l|MdRzDhN7$NRn[t)d hash:ip family inet hashsize 1024 maxelem 65536 comment
$ sudo iptables-save
# Generated by iptables-save v1.6.1 on Tue Nov 20 09:09:49 2018
*filter
:INPUT ACCEPT [12497:7846349]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [21397:2464438]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:WEAVE-NPC - [0:0]
:WEAVE-NPC-DEFAULT - [0:0]
:WEAVE-NPC-EGRESS - [0:0]
:WEAVE-NPC-EGRESS-ACCEPT - [0:0]
:WEAVE-NPC-EGRESS-CUSTOM - [0:0]
:WEAVE-NPC-EGRESS-DEFAULT - [0:0]
:WEAVE-NPC-INGRESS - [0:0]
-A INPUT -j KUBE-FIREWALL
-A INPUT -i weave -j WEAVE-NPC-EGRESS
-A FORWARD -i weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC-EGRESS
-A FORWARD -o weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -d 224.0.0.0/4 -j ACCEPT
-A WEAVE-NPC -m physdev --physdev-out vethwe-bridge -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-;rGqyMIl1HN^cfDki~Z$3]6!N dst -m comment --comment "DefaultAllow ingress isolation for namespace: default" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-system" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-public" -j ACCEPT
-A WEAVE-NPC-EGRESS -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC-EGRESS -m physdev --physdev-in vethwe-bridge -j RETURN
-A WEAVE-NPC-EGRESS -d 224.0.0.0/4 -j RETURN
-A WEAVE-NPC-EGRESS -m state --state NEW -j WEAVE-NPC-EGRESS-DEFAULT
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j WEAVE-NPC-EGRESS-CUSTOM
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j NFLOG --nflog-group 86
-A WEAVE-NPC-EGRESS -m mark ! --mark 0x40000/0x40000 -j DROP
-A WEAVE-NPC-EGRESS-ACCEPT -j MARK --set-xmark 0x40000/0x40000
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j RETURN
-A WEAVE-NPC-INGRESS -m set --match-set weave-{U;]TI.l|MdRzDhN7$NRn[t)d src -m set --match-set weave-KN[_+Gl.dlb1q$;v4h!E_Sg)( dst -m comment --comment "pods: namespace: default, selector: access=true -> pods: namespace: default, selector: run=nginx (ingress)" -j ACCEPT
COMMIT
# Completed on Tue Nov 20 09:09:49 2018
# Generated by iptables-save v1.6.1 on Tue Nov 20 09:09:49 2018
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [18:1421]
:POSTROUTING ACCEPT [18:1421]
:KUBE-FIREWALL - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SERVICES - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -j WEAVE
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-NODE-PORT -j KUBE-MARK-MASQ
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-POSTROUTING -m comment --comment "Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose" -m set --match-set KUBE-LOOP-BACK dst,dst,src -j MASQUERADE
-A KUBE-SERVICES ! -s 10.32.0.0/12 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A WEAVE -s 10.32.0.0/12 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 10.32.0.0/12 -d 10.32.0.0/12 -j MASQUERADE
-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE
COMMIT
# Completed on Tue Nov 20 09:09:49 2018
murali-reddy commented 6 years ago

@MaxBischoff thanks for sharing the iptables and ipset details. As pointed out by @bboreham, this does look like the result of masquerading. Since both pods are running on the same node, masqueraded traffic ends up with the node's IP as its source IP. Network policies are generally implemented to allow any traffic from node-local IPs, so that traffic is not run through the network policies.

Looking at the possible cases where kube-proxy in IPVS mode masquerades traffic, I don't expect traffic to get masqueraded in your deployment.

Would it be possible to add an additional node to the cluster (to ensure the pods run on different nodes) and see if this scenario works? Alternatively, if you can confirm (by tcpdump'ing the traffic) that the traffic is getting masqueraded, that would explain why network policies are not enforced.
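
A tcpdump check along those lines could look like this (a sketch; the interface name "weave" and the pod IPs are taken from the outputs above):

# On the node hosting the nginx pod, capture traffic destined for the pod while
# repeating the wget from busybox. A source address equal to the node's IP instead
# of the busybox pod IP (10.32.0.5) would confirm the traffic is being masqueraded.
$ sudo tcpdump -ni weave host 10.32.0.2 and tcp port 80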

maxbischoff commented 6 years ago
$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE                 NOMINATED NODE
busybox-7cd98849ff-l4gvx   1/1     Running   0          10s   10.32.0.2   thesis-test-node-1   <none>
nginx-64f497f8fd-j7ms8     1/1     Running   0          3m    10.40.0.2   thesis-test-node-0   <none>
$ kubectl run busybox --rm -ti --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget --spider -T 1 nginx
Connecting to nginx (10.96.113.180:80)
wget: download timed out

It seems like the traffic is getting masqueraded.

murali-reddy commented 6 years ago

@MaxBischoff thanks for confirming. You need to check what is causing kube-proxy to masquerade the traffic. Please note that, in general, the semantics of network policies only deal with pod IPs or ipBlock and do not really cover how the service proxy fits in.

bboreham commented 6 years ago

I wonder if it is hitting this line in our iptables rules:

-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE

If the DNAT done by IPVS happens after these iptables rules run, then I think it would.

murali-reddy commented 6 years ago

I just provisioned a cluster with kube-proxy in IPVS mode. Things are in order as expected by Weave. I don't see traffic getting masqueraded when the service IP is accessed from the pods.

Chain POSTROUTING (policy ACCEPT 10 packets, 600 bytes)
pkts bytes target     prot opt in     out     source               destination
1445 97106 KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
   0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0
1248 83506 WEAVE      all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain KUBE-POSTROUTING (1 references)
pkts bytes target     prot opt in     out     source               destination
   0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
   0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src
Chain WEAVE (1 references)
pkts bytes target     prot opt in     out     source               destination
   0     0 RETURN     all  --  *      *       100.96.0.0/11        224.0.0.0/4
   0     0 MASQUERADE  all  --  *      *      !100.96.0.0/11        100.96.0.0/11
   5   300 MASQUERADE  all  --  *      *       100.96.0.0/11       !100.96.0.0/11

@MaxBischoff if you still have the cluster and it's not too much trouble, can you please share the output of the commands below, run on the node where busybox is running, while you run the test:

iptables -t nat -L KUBE-POSTROUTING -n -v
iptables -t nat -L WEAVE -n -v

and please see which of the rules has its pkts count incremented (i.e. which rule ends up masquerading the traffic).
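
For convenience, the two commands above can each be left running under watch (in separate terminals) while the wget is repeated, so the incrementing rule is easy to spot:

$ watch -n1 'sudo iptables -t nat -L KUBE-POSTROUTING -n -v'
$ watch -n1 'sudo iptables -t nat -L WEAVE -n -v'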

maxbischoff commented 6 years ago

I tried it, but it seems to hit neither:

$ sudo iptables -t nat -L KUBE-POSTROUTING -n -v
Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose */ match-set KUBE-LOOP-BACK dst,dst,src
$ sudo iptables -t nat -L WEAVE -n -v
Chain WEAVE (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  *      *       10.32.0.0/12         224.0.0.0/4         
    0     0 MASQUERADE  all  --  *      *      !10.32.0.0/12         10.32.0.0/12        
    6   360 MASQUERADE  all  --  *      *       10.32.0.0/12        !10.32.0.0/12        

After running wget in busybox:

$ sudo iptables -t nat -L KUBE-POSTROUTING -n -v
Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose */ match-set KUBE-LOOP-BACK dst,dst,src
$ sudo iptables -t nat -L WEAVE -n -v
Chain WEAVE (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  *      *       10.32.0.0/12         224.0.0.0/4         
    0     0 MASQUERADE  all  --  *      *      !10.32.0.0/12         10.32.0.0/12        
    6   360 MASQUERADE  all  --  *      *       10.32.0.0/12        !10.32.0.0/12     
aleks-mariusz commented 5 years ago

I believe I am affected by this as well - Kubernetes 1.14.1, weave-net 2.5.1

$ kubectl get svc,pod -o wide
NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE     SELECTOR
service/kubernetes   ClusterIP   10.12.12.1     <none>        443/TCP   5m51s   <none>
service/nginx        ClusterIP   10.12.13.235   <none>        80/TCP    6s      run=nginx

NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE        NOMINATED NODE   READINESS GATES
pod/nginx-7db9fccd9b-hzkmd   1/1     Running   0          23s   10.12.19.2   bsiklk8w1   <none>           <none>
pod/nginx-7db9fccd9b-rll6v   1/1     Running   0          23s   10.12.19.1   bsiklk8w1   <none>           <none>
$ kubectl run busybox --rm -ti --image=busybox /bin/sh
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
/ # wget --spider --timeout=1 10.12.13.235
Connecting to 10.12.13.235 (10.12.13.235:80)
/ # wget --spider --timeout=1 nginx 
Connecting to nginx (10.12.13.235:80)
/ # wget --spider --timeout=1 10.12.19.2
Connecting to 10.12.19.2 (10.12.19.2:80)
wget: download timed out
/ # wget --spider --timeout=1 10.12.19.1
Connecting to 10.12.19.1 (10.12.19.1:80)
wget: download timed out

the attempts listed above are as follows:

  1. connecting to service IP, is not blocked
  2. connecting to "nginx" service, is not blocked (unexpected)
  3. connecting to pod IP, is blocked (as expected)
  4. connecting to other pod IP, is also blocked (as expected)

More log files and settings also captured

murali-reddy commented 5 years ago

@aleks-mariusz The context of your problem is not clear. Have you applied any network policies? Perhaps opening a new issue with all the details would be helpful.

As far as this issue is concerned, the network policy did not work when the service was accessed because the traffic was getting masqueraded, as noted in this comment. In that case it is known (nothing specific to Weave NPC; network policies in general do not interact well with services) that network policies will not work as expected.

aleks-mariusz commented 5 years ago

Sorry for not being clear; I piggybacked off this issue because I'm using the exact same tests as the OP and am also using IPVS. I am trying to give more data points to help get this particular issue worked on/resolved, rather than opening what seems like a duplicate issue (unless that's preferred?).

BTW/FYI, traffic to services is blocked properly by weave-npc, but only when kube-proxy uses legacy iptables mode. It's just that when kube-proxy uses IPVS, services are not properly blocked.

Unfortunately this means I will have to revert my Weave-powered Kubernetes cluster back to iptables mode until weave-npc supports IPVS mode as well as it supports iptables.

murali-reddy commented 5 years ago

Thanks @aleks-mariusz. I did try kube-proxy in IPVS mode for this issue and did not run into any problem.

Can you please see

https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/ipvs/README.md#when-ipvs-falls-back-to-iptables

and check whether, in IPVS mode, --masquerade-all=true or --cluster-cidr is specified?

Can you also confirm whether the counters increase when you run watch "iptables -t nat -L KUBE-POSTROUTING -n -v"?
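
For a kubeadm-provisioned cluster, one way to check those settings (a sketch; the ConfigMap name, field names and pod label are assumptions about how the cluster was set up):

# Look for the proxy mode, masqueradeAll and clusterCIDR fields in the kube-proxy config
$ kubectl -n kube-system get configmap kube-proxy -o yaml | grep -E 'mode|masqueradeAll|clusterCIDR'
# Or check what the running kube-proxy pods report in their logs
$ kubectl -n kube-system logs -l k8s-app=kube-proxy | grep -i -E 'proxier|masquerade'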

aleks-mariusz commented 5 years ago

Re masquerading: masquerade-all is set to false and I am not specifying cluster-cidr in the kube-proxy config. I still need to check the counters, but I'm not sure what that will tell us.

murali-reddy commented 5 years ago

I still need to check the counters, but I'm not sure what that will tell us

There are two cases in which network policies do not work as expected. One is when traffic gets MASQUERADEd, either by kube-proxy or by Weave, which changes the source IP of the packet and results in ingress network policies not matching as expected.

The other case is when all traffic from the node is allowed to the pods running on that node (https://github.com/weaveworks/weave/issues/3285). So if you are running on a single node, or where the source and destination pods are on the same node, you will see network policies not working as expected.
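
To rule out the same-node case when testing, the client pod can be pinned to a node other than the one running nginx, for example via nodeName (a sketch; the node name is a placeholder for whichever node does not host nginx):

$ cat > busybox-other-node.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: busybox-other-node
spec:
  nodeName: thesis-test-node-1   # placeholder: any node not running the nginx pod
  containers:
  - name: busybox
    image: busybox
    command: ["sleep", "3600"]
EOF
$ kubectl apply -f busybox-other-node.yaml
$ kubectl exec busybox-other-node -- wget --spider -T 1 nginx   # expected to time out for an unlabelled pod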