weaveworks / weave

Simple, resilient multi-host container networking and more.
https://www.weave.works
Apache License 2.0

Cannot curl a container, but can ping it #2842

Open panuhorsmalahti opened 7 years ago

panuhorsmalahti commented 7 years ago

This issue is similar to #2433, and especially similar to https://github.com/weaveworks/weave/issues/2433#issuecomment-265993860

Created a new issue as per https://github.com/weaveworks/weave/issues/2433#issuecomment-285688668

I can ping and curl the container IP from the host, and I can ping the container from another container, but curl from inside another container doesn't work.

Version information:

docker: Docker version 1.12.5, build 047e51b/1.12.5
kontena version: 1.1.2
Weave version: 1.8.2
Linux: Linux version 3.10.0-514.6.1.el7.x86_64 (mockbuild@x86-030.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Sat Dec 10 11:15:38 EST 2016
Storage driver: devicemapper
OS: Red Hat Enterprise Linux

[root]# ip addr
2817: vethwepl7963@if2816: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 36:d6:20:80:bd:80 brd ff:ff:ff:ff:ff:ff link-netnsid 36
2561: veth138dd2e@if2560: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether ae:85:5f:3b:4a:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 18
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:b9:bb:b3 brd ff:ff:ff:ff:ff:ff
    inet 10.193.38.14/26 brd 10.193.38.63 scope global ens192
       valid_lft forever preferred_lft forever
2563: vethwepl7566@if2562: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether ee:3b:7e:50:bb:4f brd ff:ff:ff:ff:ff:ff link-netnsid 17
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:85:79:0f:74 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
4: datapath: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UNKNOWN qlen 1000
    link/ether d6:34:e1:5e:3b:ae brd ff:ff:ff:ff:ff:ff
2565: vethwepl8134@if2564: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 32:49:00:30:19:55 brd ff:ff:ff:ff:ff:ff link-netnsid 18
6: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UP qlen 1000
    link/ether 4a:b4:26:7b:6c:89 brd ff:ff:ff:ff:ff:ff
    inet 10.81.0.1/16 scope global weave
       valid_lft forever preferred_lft forever
2567: vethbc67172@if2566: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 5e:ad:c4:17:a1:54 brd ff:ff:ff:ff:ff:ff link-netnsid 19
7: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether ee:0c:2e:b5:34:96 brd ff:ff:ff:ff:ff:ff
3337: vethwepl19161@if3336: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 8e:09:f4:81:4a:b1 brd ff:ff:ff:ff:ff:ff link-netnsid 45
9: vethwe-datapath@vethwe-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master datapath state UP qlen 1000
    link/ether 2a:9e:2c:13:04:ca brd ff:ff:ff:ff:ff:ff
10: vethwe-bridge@vethwe-datapath: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP qlen 1000
    link/ether 76:b7:5a:73:fe:06 brd ff:ff:ff:ff:ff:ff
2571: vethwepl8796@if2570: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 72:68:00:78:c0:00 brd ff:ff:ff:ff:ff:ff link-netnsid 19
3341: vethwepl19781@if3340: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 0a:94:a5:96:7f:34 brd ff:ff:ff:ff:ff:ff link-netnsid 46
2575: vethwepl9143@if2574: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 62:04:15:25:85:ae brd ff:ff:ff:ff:ff:ff link-netnsid 20
3345: vethwepl20347@if3344: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 9a:47:0f:20:e3:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 26
2577: vethwepl9586@if2576: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 56:af:fa:68:86:09 brd ff:ff:ff:ff:ff:ff link-netnsid 21
3603: veth0fedb9d@if3602: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether fe:71:0d:65:6f:45 brd ff:ff:ff:ff:ff:ff link-netnsid 6
3347: vethffb0999@if3346: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 7e:ff:ae:24:e8:2c brd ff:ff:ff:ff:ff:ff link-netnsid 29
2579: vetha23e2bc@if2578: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 52:66:da:fe:79:e1 brd ff:ff:ff:ff:ff:ff link-netnsid 22
3605: vethwepl9350@if3604: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 76:6d:b8:56:94:b8 brd ff:ff:ff:ff:ff:ff link-netnsid 6
3349: vethwepl26401@if3348: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 86:80:62:de:ae:9b brd ff:ff:ff:ff:ff:ff link-netnsid 29
2581: veth7aae92b@if2580: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 1e:26:2e:43:ef:27 brd ff:ff:ff:ff:ff:ff link-netnsid 23
2583: vethwepl10354@if2582: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 1e:b0:24:75:f5:c0 brd ff:ff:ff:ff:ff:ff link-netnsid 22
2585: veth16cd5c7@if2584: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 2a:d1:59:19:19:aa brd ff:ff:ff:ff:ff:ff link-netnsid 24
2587: vethwepl10748@if2586: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 2e:07:33:cb:84:20 brd ff:ff:ff:ff:ff:ff link-netnsid 23
2591: vethwepl11214@if2590: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether fa:9f:4e:9f:5b:c4 brd ff:ff:ff:ff:ff:ff link-netnsid 24
2595: vethwepl11666@if2594: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 3e:38:8a:c3:47:7c brd ff:ff:ff:ff:ff:ff link-netnsid 25
3365: vethwepl8386@if3364: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 2e:1d:4a:6b:5b:6f brd ff:ff:ff:ff:ff:ff link-netnsid 30
3381: vethwepl10930@if3380: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether b2:65:33:4a:88:81 brd ff:ff:ff:ff:ff:ff link-netnsid 43
3385: vethwepl11609@if3384: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 12:cc:58:09:10:bb brd ff:ff:ff:ff:ff:ff link-netnsid 44
3389: vethwepl12234@if3388: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 4a:06:e6:ee:90:ac brd ff:ff:ff:ff:ff:ff link-netnsid 33
3391: veth55f29a7@if3390: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether aa:c0:5e:ce:40:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
3393: vethwepl13508@if3392: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 7a:99:df:df:24:57 brd ff:ff:ff:ff:ff:ff link-netnsid 0
3137: vethwepl24638@if3136: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 9a:e4:41:e2:15:60 brd ff:ff:ff:ff:ff:ff link-netnsid 53
3395: vethd0ab2fa@if3394: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 86:16:c7:ea:df:2b brd ff:ff:ff:ff:ff:ff link-netnsid 1
3397: vethwepl13154@if3396: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 36:6e:2d:76:1a:7f brd ff:ff:ff:ff:ff:ff link-netnsid 1
2653: vethwepl6228@if2652: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 32:47:99:04:80:d1 brd ff:ff:ff:ff:ff:ff link-netnsid 31
2411: vxlan-6784: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65470 qdisc noqueue master datapath state UNKNOWN qlen 1000
    link/ether fe:aa:25:88:65:72 brd ff:ff:ff:ff:ff:ff
2701: vethwepl17929@if2700: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 6e:35:fa:99:bc:a9 brd ff:ff:ff:ff:ff:ff link-netnsid 28
2705: vethwepl18402@if2704: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 4e:da:f8:65:c6:88 brd ff:ff:ff:ff:ff:ff link-netnsid 37
2709: vethwepl18940@if2708: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 26:33:50:bb:ce:5d brd ff:ff:ff:ff:ff:ff link-netnsid 38
2713: vethwepl19668@if2712: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether d6:e9:30:83:c3:9c brd ff:ff:ff:ff:ff:ff link-netnsid 39
3753: vethwepl23259@if3752: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 96:08:f6:02:73:6d brd ff:ff:ff:ff:ff:ff link-netnsid 9
3757: vethwepl23941@if3756: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether da:5b:8e:10:8d:9e brd ff:ff:ff:ff:ff:ff link-netnsid 10
2479: vethwepl6832@if2478: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 1a:6f:72:3e:6b:41 brd ff:ff:ff:ff:ff:ff link-netnsid 14
3761: vethwepl24474@if3760: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 02:d5:ed:85:db:e7 brd ff:ff:ff:ff:ff:ff link-netnsid 4
3779: veth9ebf941@if3778: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 72:5e:91:e1:fc:81 brd ff:ff:ff:ff:ff:ff link-netnsid 2
3781: vethwepl30712@if3780: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 46:6e:c3:b7:02:50 brd ff:ff:ff:ff:ff:ff link-netnsid 2
3793: vethwepl32388@if3792: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 26:d8:c4:4e:a6:76 brd ff:ff:ff:ff:ff:ff link-netnsid 5
3795: veth31fb42f@if3794: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 2a:4f:d0:ca:f2:f2 brd ff:ff:ff:ff:ff:ff link-netnsid 8
3797: vethwepl590@if3796: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 86:9d:2f:95:3b:ed brd ff:ff:ff:ff:ff:ff link-netnsid 8
3799: veth2be23ec@if3798: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether ea:ba:d7:2e:25:c8 brd ff:ff:ff:ff:ff:ff link-netnsid 12
3801: vethwepl1331@if3800: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether c2:08:f7:8c:74:7e brd ff:ff:ff:ff:ff:ff link-netnsid 12
3803: veth9776d0d@if3802: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether ba:fc:6d:3a:af:9f brd ff:ff:ff:ff:ff:ff link-netnsid 3
3805: vethwepl1894@if3804: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 5e:ea:d9:6c:e1:f9 brd ff:ff:ff:ff:ff:ff link-netnsid 3
2789: vethwepl23742@if2788: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 96:81:59:9e:0e:9f brd ff:ff:ff:ff:ff:ff link-netnsid 32
2533: vethwepl23876@if2532: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 1a:5c:38:b4:9d:00 brd ff:ff:ff:ff:ff:ff link-netnsid 11
2537: vethwepl24545@if2536: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 12:7d:df:ca:79:19 brd ff:ff:ff:ff:ff:ff link-netnsid 13
2795: vethwepl24844@if2794: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 1e:5e:bf:07:1d:db brd ff:ff:ff:ff:ff:ff link-netnsid 27
2797: vethwepl24989@if2796: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether 36:7b:54:87:9b:99 brd ff:ff:ff:ff:ff:ff link-netnsid 41
2541: vethwepl25828@if2540: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether fe:62:ba:24:00:2f brd ff:ff:ff:ff:ff:ff link-netnsid 15
2545: vethwepl28113@if2544: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether da:aa:21:6b:78:27 brd ff:ff:ff:ff:ff:ff link-netnsid 16
3321: vethwepl16816@if3320: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether be:90:4b:41:c5:9b brd ff:ff:ff:ff:ff:ff link-netnsid 42
2813: vethwepl7304@if2812: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP 
    link/ether f6:03:b1:7d:d2:76 brd ff:ff:ff:ff:ff:ff link-netnsid 35
2815: vethce44706@if2814: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 52:7c:1c:f6:bb:52 brd ff:ff:ff:ff:ff:ff link-netnsid 36

weave-zombie-hunter.sh output: https://gist.github.com/panuhorsmalahti/302d520353fb23196fc8c179925ce501

arping output:

Host:
arping -I weave 10.81.128.20
ARPING 10.81.128.20 from 10.81.0.1 weave
Unicast reply from 10.81.128.20 [82:D0:28:6B:7B:42]  0.682ms
Unicast reply from 10.81.128.20 [06:FC:01:5E:B5:FA]  0.706ms
Unicast reply from 10.81.128.20 [FE:FA:0B:55:78:10]  0.720ms
Unicast reply from 10.81.128.20 [FE:FA:0B:55:78:10]  0.530ms

Container:
arping 10.81.128.20         
ARPING 10.81.128.20
42 bytes from 82:d0:28:6b:7b:42 (10.81.128.20): index=0 time=5.955 msec
42 bytes from 06:fc:01:5e:b5:fa (10.81.128.20): index=1 time=6.010 msec
42 bytes from fe:fa:0b:55:78:10 (10.81.128.20): index=2 time=6.034 msec
42 bytes from 82:d0:28:6b:7b:42 (10.81.128.20): index=3 time=8.709 msec
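
One way to follow up on duplicate ARP replies like the above is to look up which bridge port each responding MAC address was learned on. A minimal sketch, assuming the iproute2 bridge tool is available and the bridge is named weave as in the ip addr output:

# show the bridge port each of the responding MACs was learned on
for mac in 82:d0:28:6b:7b:42 06:fc:01:5e:b5:fa fe:fa:0b:55:78:10; do
    echo "== $mac =="
    bridge fdb show br weave | grep -i "$mac"
done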
panuhorsmalahti commented 7 years ago

As per https://github.com/weaveworks/weave/issues/2433#issuecomment-285726645 I ran weave-zombie-hunter.sh again, after redownloading the scripts from https://gist.github.com/SpComb/9cc2a7ed46ff708547d09854b78c197f#file-weave-zombie-hunter-sh and after I had executed the "behead" commands given in the output of the first run:

https://gist.github.com/panuhorsmalahti/20d76053c7bc370b97f18fe476f91a3e

In docker logs weave I see ERRO: 2017/03/10 09:24:47.188382 Captured frame from MAC (ba:ad:e6:6a:11:0f) to (02:68:2f:29:fb:27) associated with another peer ba:ad:e6:6a:11:0f(app-dev.itidm.domain.com), about 1600 times in total.

ip -d link show vethwe-bridge
10: vethwe-bridge@vethwe-datapath: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue master weave state UP mode DEFAULT qlen 1000
    link/ether 76:b7:5a:73:fe:06 brd ff:ff:ff:ff:ff:ff promiscuity 1 
    veth 
    bridge_slave addrgenmode eui64 
cat weave-log.txt | grep hairpin
ERRO: 2017/03/06 11:41:10.241698 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:10.453088 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:10.665129 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:11.090084 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:11.940130 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:13.640174 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:17.036124 Vetoed installation of hairpin flow FlowSpec{keys: [InPortFlowKey{vport: 1} EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:23.836105 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:37.420122 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/06 11:41:39.424102 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: 1e:fa:7a:e2:79:29, dst: de:41:89:b6:e5:3e} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/10 13:43:52.773415 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: e2:7f:bd:e3:db:d3, dst: 8e:83:29:cd:26:e9} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}
ERRO: 2017/03/10 13:43:53.710120 Vetoed installation of hairpin flow FlowSpec{keys: [EthernetFlowKey{src: e2:7f:bd:e3:db:d3, dst: 8e:83:29:cd:26:e9} InPortFlowKey{vport: 1}], actions: [OutputAction{vport: 1}]}

I can see no errors apart from Vetoed installation of hairpin and Captured frame from MAC.

SpComb commented 7 years ago

Your weave-zombie-hunter.sh output seems odd, and the specifics of your symptoms are also different from what I had in https://github.com/weaveworks/weave/issues/2433#issuecomment-265993860.

You also have multiple MAC addresses responding to ARP queries for the same weave overlay IP address, which is similar. However, you have one set of MAC addresses that are present in the bridge fdb show of multiple vethwepl* interfaces, which is unexpected.

Can you post the full ip link, weaveexec ps and bridge fdb show br weave output?

The way I read the script, if it prints id=8dfc61bc31d8 ip=10.81.128.55/16 then it should not subsequently print missing for the same veth. Yet most of your veths have both.

Not sure what's happening here either, it should not output both...

panuhorsmalahti commented 7 years ago

Well, I finally had to reboot the server and after reboot I didn't see any weave errors. I'll report back if the problem resurfaces.

SpComb commented 7 years ago

There's some kind of #2433-ish confusion going on here in any case, because there are 42 master weave veths but only 13 master docker0 veths... there should always be fewer weave veths than Docker veths, so the remainder of those vethwepl* interfaces are presumably leftovers from dead containers?
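
Those counts can be reproduced from the ip addr dump above, or directly on the host with the standard master filter in iproute2, e.g.:

ip -o link show master weave | grep -c vethwepl
ip -o link show master docker0 | grep -c veth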

The way I read the script, if it prints id=8dfc61bc31d8 ip=10.81.128.55/16 then it should not subsequently print missing for the same veth. Yet most of your veths have both.

Not sure what's happening here either, it should not output both...

Right, that was just some Bash subprocess confusion...

However, you have one set of MAC addresses that are present in the bridge fdb show of multiple vethwepl* interfaces, which is unexpected.

It seems like bridge fdb show ignores any unknown parameters: bridge fdb show br weave brportasdjfio vethwepl10181 just shows all bridge ports... :angry: Maybe your version of bridge doesn't know about brport? That would cause that kind of output.

Updated the gist with fixes for those issues...
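
For reference, brport is the keyword that restricts the fdb dump to a single bridge port on iproute2 versions that support it; the intended query would look something like:

bridge fdb show br weave brport vethwepl10181
# older bridge binaries silently ignore unknown keywords and dump the whole table instead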

bboreham commented 7 years ago

I'm not sure how we can make progress on this unless someone is able to recreate it.

We'll leave it open for a week to see if more information becomes available.

panuhorsmalahti commented 7 years ago

I think someone else on the Kontena Slack channel reported having this issue as well.

I just had this issue again.

Output of updated weave-zombie-hunter.sh by @SpComb:

[root]# ./weave-zombie-hunter.sh
#   missing docker netns vethwepl10913 ifpeer=42
# vethwepl10913: unknown
#   zombie veth vethwepl12545: pid=12545
#   missing docker netns vethwepl12545 ifpeer=426
# vethwepl12545: zombie
ip link set vethwepl12545 nomaster
#   found weaveps vethwepl16371 mac=62:62:1b:81:7b:f6: id=d2fed61d2cab ip=10.81.128.103/16
#   missing docker netns vethwepl16371 ifpeer=64
# vethwepl16371: alive
#   zombie veth vethwepl17581: pid=17581
#   missing docker netns vethwepl17581 ifpeer=386
# vethwepl17581: zombie
ip link set vethwepl17581 nomaster
#   missing docker netns vethwepl24590 ifpeer=390
# vethwepl24590: unknown
#   missing docker netns vethwepl25256 ifpeer=394
# vethwepl25256: unknown
#   missing docker netns vethwepl26808 ifpeer=398
# vethwepl26808: unknown
#   missing docker netns vethwepl30575 ifpeer=430
# vethwepl30575: unknown
#   zombie veth vethwepl3556: pid=3556
#   missing docker netns vethwepl3556 ifpeer=402
# vethwepl3556: zombie
ip link set vethwepl3556 nomaster
#   missing docker netns vethwepl4565 ifpeer=406
# vethwepl4565: unknown
#   zombie veth vethwepl4858: pid=4858
#   missing docker netns vethwepl4858 ifpeer=14
# vethwepl4858: zombie
ip link set vethwepl4858 nomaster
#   found weaveps vethwepl5409 mac=ce:e2:2a:e9:f3:27: id=eba49edb9660 ip=10.81.128.22/16
#   missing docker netns vethwepl5409 ifpeer=18
# vethwepl5409: alive
#   zombie veth vethwepl5861: pid=5861
#   missing docker netns vethwepl5861 ifpeer=22
# vethwepl5861: zombie
ip link set vethwepl5861 nomaster
#   missing docker netns vethwepl630 ifpeer=438
# vethwepl630: unknown
#   missing docker netns vethwepl6572 ifpeer=26
# vethwepl6572: unknown
#   zombie veth vethwepl7529: pid=7529
#   missing docker netns vethwepl7529 ifpeer=30
# vethwepl7529: zombie
ip link set vethwepl7529 nomaster
#   zombie veth vethwepl8217: pid=8217
#   missing docker netns vethwepl8217 ifpeer=34
# vethwepl8217: zombie
ip link set vethwepl8217 nomaster
#   zombie veth vethwepl9663: pid=9663
#   missing docker netns vethwepl9663 ifpeer=38
# vethwepl9663: zombie
ip link set vethwepl9663 nomaster
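
The zombie detection above follows from weave's veth naming: each container veth is named vethwepl<pid> after the container's init process, so an interface whose PID no longer exists is a likely zombie. A rough manual equivalent (the real weave-zombie-hunter.sh also cross-checks weave ps and Docker's netns mounts):

# flag vethwepl* interfaces on the weave bridge whose owning PID has exited
for veth in $(ip -o link show master weave | awk -F': ' '{print $2}' | cut -d@ -f1 | grep '^vethwepl'); do
    pid=${veth#vethwepl}
    [ -d "/proc/$pid" ] || echo "possible zombie: $veth (pid $pid no longer running)"
done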
SpComb commented 7 years ago

That seems to be saying missing docker netns for every veth, so I suspect the docker-netns-addrs.sh must be broken on your system... it was written and tested on CoreOS. What does it output if you run it directly?

The bridge fdb diagnostics seem to be less broken now... it looks like both of the active MACs it found belong to legitimately running containers.

It does still look like it's finding actual zombie veths. Unfortunately this diagnosis still doesn't say anything about why those Docker network namespaces are being left behind after the containers are removed.
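
docker-netns-addrs.sh isn't reproduced in this thread, but it presumably inspects Docker's netns bind mounts; a rough manual equivalent for checking what is mounted there (the <id> placeholder is one of the listed entries; nsenter is from util-linux):

ls -l /var/run/docker/netns/
# look at the addresses inside one of the mounted namespaces
nsenter --net=/var/run/docker/netns/<id> ip -o addr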

panuhorsmalahti commented 7 years ago

docker-netns-addrs.sh output is and was empty.

SpComb commented 7 years ago

I think someone else on the Kontena Slack channel reported having this issue as well.

I've heard of three cases of this happening recently: once on the weave slack, and twice in the kontena slack. Both the weave slack case and your case were with CentOS/RHEL... I suspect there's something going on with the RHEL 3.10 kernel, and this doesn't happen with the CoreOS 4.x kernels.

docker-netns-addrs.sh output is and was empty.

I guess the namespaces are somehow different in CentOS/RHEL than CoreOS...

mudspot commented 7 years ago

Hi all, I think I'm the one who reported over the Kontena Slack channel.

I am using the default Kontena images, so it happened on CoreOS too.

I've been assigned (by my company) to attempt to replicate this issue. ETA 3 days. Let's hope I can get some more info.

Anything I should log down?

brb commented 7 years ago

@mudspot would it be possible to get access to your cluster? I'm "martynas" on http://weave-community.slack.com/.

mudspot commented 7 years ago

@brb I'll let you have access to my dev cluster when I have managed to replicate the error state.

SpComb commented 7 years ago

When this is happening while running with Kontena, does a docker stop kontena-cadvisor make the zombie veths go away and allow curl (TCP connections to the service) to start working again?

SpComb commented 7 years ago

This is easy to reproduce when running a cAdvisor container using the default /rootfs and /var/run bind mounts in the default rprivate mount propagation mode. Any Docker containers that are running when the cAdvisor container is started will become zombies once stopped: https://github.com/kontena/kontena/issues/2004.
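
For context, a sketch of the kind of cAdvisor invocation being described; the image name and flag set here are the commonly documented ones rather than necessarily what Kontena runs, the relevant part being the /rootfs and /var/run bind mounts with default rprivate propagation:

docker run -d --name cadvisor \
    -v /:/rootfs:ro \
    -v /var/run:/var/run:rw \
    -v /sys:/sys:ro \
    -v /var/lib/docker/:/var/lib/docker:ro \
    -p 8080:8080 \
    google/cadvisor:latest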

When (re)starting the cadvisor container, it will pick up the netns mounts for any currently running Docker containers, and those will remain mounted in the cadvisor container's mount namespace after Docker unmounts them, keeping the netns alive even after Docker considers the container to be destroyed.

This isn't that noticeable on CoreOS, since the libnetwork GC will unlink the lingering mountpoint within ~60s, and with the Linux 4.9 kernel this causes the netns to also be unmounted within the cadvisor container. It is much worse on the CentOS 7 Linux 3.10 kernel, where the netns will remain mounted indefinitely within the cadvisor container: https://github.com/kontena/kontena/issues/2004#issuecomment-288004665
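
A quick way to check whether a host is in this state is to compare the netns mounts Docker still has with the ones lingering in the cadvisor container's mount namespace; a rough sketch, assuming the container is named kontena-cadvisor as above:

cadvisor_pid=$(docker inspect --format '{{.State.Pid}}' kontena-cadvisor)
# netns bind mounts still visible inside the cadvisor container's mount namespace
grep /docker/netns "/proc/$cadvisor_pid/mountinfo"
# versus the ones Docker itself still has mounted
ls /var/run/docker/netns/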

mudspot commented 7 years ago

@SpComb tried your suggestion.

It does not start working again.

SpComb commented 7 years ago

Yes, it looks like there are two distinct cases here... the one described in https://github.com/kontena/kontena/issues/2004 is related to containers running when cadvisor starts, and it causes temporary zombies on CoreOS and persistent zombies on CentOS.

Then there is some second issue that also affects CoreOS, and leads to persistent zombie netns's which I'm unable to find any references to. This is presumably some race condition, and may or may not be related to cadvisor.

SpComb commented 7 years ago

Then there is some second issue that also affects CoreOS, and leads to persistent zombie netns's which I'm unable to find any references to. This is presumably some race condition, and may or may not be related to cadvisor.

There's some clue that the CoreOS / Linux 4.9 case might be related to the kontena/openvpn container:

No theory as to how the OpenVPN server container triggers this issue though. It's a normal network-namespaced Docker container, but it has cap NET_ADMIN, with a tun device and iptables NAT rules within the container.

SpComb commented 7 years ago

Some intensive tracing work (see kernel-trace.sh) reveals that it's not the openvpn container... it's actually the docker-proxy processes that are leaking the network namespaces, via some kind of netlink sockets: https://gist.github.com/SpComb/f83c53b05aad79c6213185a3da7a7902
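
kernel-trace.sh itself isn't reproduced here, but the traces below are standard kprobe_events output; a minimal sketch of setting up that kind of probe by hand, assuming tracefs is mounted at /sys/kernel/debug/tracing and using probe names matching the output below:

cd /sys/kernel/debug/tracing
echo 'p:p_netlink_release_0 netlink_release' >> kprobe_events
echo 'p:p___put_net_0 __put_net' >> kprobe_events
echo 'p:p_cleanup_net_0 cleanup_net' >> kprobe_events
echo 1 > options/stacktrace   # include the stack traces shown below
echo 1 > events/kprobes/enable
cat trace_pipe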

A host has a number of zombie weave veths:

# vethwepl1541: pid=1541 ifpeer=108
#   zombie pid=1541
# vethwepl1704: pid=1704 ifpeer=44
#   zombie pid=1704
# vethwepl1907: pid=1907 ifpeer=50
#   zombie pid=1907
# vethwepl1913: pid=1913 ifpeer=52
#   zombie pid=1913
# vethwepl1966: pid=1966 ifpeer=62
#   zombie pid=1966
# vethwepl2353: pid=2353 ifpeer=124
#   zombie pid=2353
# vethwepl5101: pid=5101 ifpeer=128
#   zombie pid=5101

Two docker-proxy processes are running, mapping ports for a kontena/lb service:

root      7054  0.0  0.0  51732  1464 ?        Sl   Mar20   0:00  \_ /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 443 -container-ip 172.17.0.4 -container-port 443
root      7063  0.0  0.0  35340  1328 ?        Sl   Mar20   0:00  \_ /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 80 -container-ip 172.17.0.4 -container-port 80

The docker container is stopped:

2017-03-23T16:41:11.248008092Z container kill 7ea1fef33d5e745d7e9c7e8b4c5dae3e0a9493518e6cf810a74f83b51fef5a13 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2017-03-20 22:17:00 UTC, io.kontena.container.id=58d054ddec497a0006000393, io.kontena.container.name=kontena-cloud-internet-lb-1, io.kontena.container.overlay_cidr=10.81.128.16/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=kontena-cloud-internet-lb-1, io.kontena.container.service_revision=4, io.kontena.container.type=container, io.kontena.grid.name=account, io.kontena.health_check.initial_delay=10, io.kontena.health_check.interval=60, io.kontena.health_check.port=80, io.kontena.health_check.protocol=http, io.kontena.health_check.timeout=10, io.kontena.health_check.uri=/__health, io.kontena.service.id=57ee4e29fb9e240007001437, io.kontena.service.instance_number=1, io.kontena.service.name=kontena-cloud-internet-lb, io.kontena.stack.name=null, name=kontena-cloud-internet-lb-1, signal=15)
2017-03-23T16:41:11.272829538Z container die 7ea1fef33d5e745d7e9c7e8b4c5dae3e0a9493518e6cf810a74f83b51fef5a13 (exitCode=0, image=kontena/lb:latest, io.kontena.container.deploy_rev=2017-03-20 22:17:00 UTC, io.kontena.container.id=58d054ddec497a0006000393, io.kontena.container.name=kontena-cloud-internet-lb-1, io.kontena.container.overlay_cidr=10.81.128.16/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=kontena-cloud-internet-lb-1, io.kontena.container.service_revision=4, io.kontena.container.type=container, io.kontena.grid.name=account, io.kontena.health_check.initial_delay=10, io.kontena.health_check.interval=60, io.kontena.health_check.port=80, io.kontena.health_check.protocol=http, io.kontena.health_check.timeout=10, io.kontena.health_check.uri=/__health, io.kontena.service.id=57ee4e29fb9e240007001437, io.kontena.service.instance_number=1, io.kontena.service.name=kontena-cloud-internet-lb, io.kontena.stack.name=null, name=kontena-cloud-internet-lb-1)
2017-03-23T16:41:11.620252349Z container stop 7ea1fef33d5e745d7e9c7e8b4c5dae3e0a9493518e6cf810a74f83b51fef5a13 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2017-03-20 22:17:00 UTC, io.kontena.container.id=58d054ddec497a0006000393, io.kontena.container.name=kontena-cloud-internet-lb-1, io.kontena.container.overlay_cidr=10.81.128.16/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=kontena-cloud-internet-lb-1, io.kontena.container.service_revision=4, io.kontena.container.type=container, io.kontena.grid.name=account, io.kontena.health_check.initial_delay=10, io.kontena.health_check.interval=60, io.kontena.health_check.port=80, io.kontena.health_check.protocol=http, io.kontena.health_check.timeout=10, io.kontena.health_check.uri=/__health, io.kontena.service.id=57ee4e29fb9e240007001437, io.kontena.service.instance_number=1, io.kontena.service.name=kontena-cloud-internet-lb, io.kontena.stack.name=null, name=kontena-cloud-internet-lb-1)

All remaining docker-proxy processes exit... and the zombie weave veths disappear:

[LINK]Deleted 109: vethwepl1541@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 45: vethwepl1704@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 51: vethwepl1907@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 53: vethwepl1913@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 63: vethwepl1966@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 125: vethwepl2353@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 129: vethwepl5101@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 
[LINK]Deleted 134: veth9905923@NONE: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default 
[LINK]Deleted 135: vethf8cbeef@NONE: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default 
[LINK]Deleted 137: vethwepl7082@NONE: <BROADCAST,MULTICAST> mtu 1410 qdisc noop state DOWN group default 

Using some kernel kprobe_events, it looks like one of the docker-proxy processes is holding on to netns references via netlink sockets, which get released once the process exits:

    docker-proxy-7109  [000] d... 76633492.785801: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.785863: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.785888: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.785991: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.786027: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.786046: p_netlink_release_0: (netlink_release+0x0/0x340)
    docker-proxy-7109  [000] d... 76633492.786050: p_netlink_release_0: (netlink_release+0x0/0x340)

 => netlink_release
 => sock_close
 => __fput
 => ____fput
 => task_work_run
 => do_exit
 => do_group_exit
 => SyS_exit_group
 => do_syscall_64
 => return_from_SYSCALL_64

This immediately results in multiple network namespaces being released:

          <idle>-0     [000] d.s. 76633492.789396: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789485: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789511: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789521: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789524: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789525: p___put_net_0: (__put_net+0x0/0x70)
     ksoftirqd/0-3     [000] d.s. 76633492.789527: p___put_net_0: (__put_net+0x0/0x70)
          <idle>-0     [000] d.s. 76633495.070476: p___put_net_0: (__put_net+0x0/0x70)

 => __put_net
 => sk_destruct
 => __sk_free
 => sk_free
 => deferred_put_nlk_sk
 => rcu_process_callbacks
 => ... various code paths

These result in cleanup_net calls that destroy the zombie veths:

   kworker/u30:1-5984  [000] d... 76633492.789433: p_cleanup_net_0: (cleanup_net+0x0/0x2b0)
   kworker/u30:1-5984  [000] d... 76633492.789446: <stack trace>
 => cleanup_net
 => worker_thread
 => kthread
 => ret_from_fork

   kworker/u30:1-5984  [000] d... 76633492.792408: p_unregister_netdevice_queue_0: (unregister_netdevice_queue+0x0/0xc0)
   kworker/u30:1-5984  [000] d... 76633492.792423: <stack trace>
 => unregister_netdevice_queue
 => default_device_exit_batch
 => ops_exit_list.isra.4
 => cleanup_net
 => [unknown/kretprobe'd]
 => worker_thread
 => kthread
 => ret_from_fork

...

   kworker/u30:1-5984  [000] d... 76633492.792443: p_netif_carrier_off_0: (netif_carrier_off+0x0/0x30)
   kworker/u30:1-5984  [000] d... 76633492.792444: <stack trace>
 => netif_carrier_off
 => __dev_close_many
 => dev_close_many
 => rollback_registered_many
 => unregister_netdevice_many
 => default_device_exit_batch
 => ops_exit_list.isra.4
 => cleanup_net
 => [unknown/kretprobe'd]
 => worker_thread
 => kthread
 => ret_from_fork

Other curiosities

It looks like an ARP timeout can also hold on to a netns reference:

          <idle>-0     [000] d.s. 76633495.070476: p___put_net_0: (__put_net+0x0/0x70)
          <idle>-0     [000] d.s. 76633495.070495: <stack trace>
 => __put_net
 => sk_destruct
 => __sk_free
 => sk_free
 => tcp_wfree
 => skb_release_head_state
 => skb_release_all
 => kfree_skb
 => arp_error_report
 => neigh_invalidate
 => neigh_timer_handler
 => call_timer_fn
 => run_timer_softirq
 => __do_softirq
 => irq_exit
 => xen_evtchn_do_upcall
 => xen_hvm_callback_vector
 => default_idle
 => arch_cpu_idle
 => default_idle_call
 => cpu_startup_entry
 => rest_init
 => start_kernel
 => x86_64_start_reservations
 => x86_64_start_kernel
SpComb commented 7 years ago

Confirmed that the dockerd process opens a netlink socket for each container network sandbox and holds onto it until the container is stopped and the network sandbox is destroyed. When the same dockerd process spawns a new docker-proxy process for a different container, this netlink socket appears to get leaked to the docker-proxy process. When the original container is stopped by Docker, this unrelated docker-proxy process still holds a reference to the original container's network namespace via the leaked netlink socket, so the original container's netns (and any associated interfaces) is not destroyed until that docker-proxy process exits when the other container is stopped by Docker.
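
If you want to check a host for this by hand, the leaked descriptor shows up as a netlink socket in the docker-proxy process's fd table (see the lsof output further below); a rough sketch:

for pid in $(pgrep -f docker-proxy); do
    echo "== docker-proxy pid $pid =="
    # netlink sockets held open by this docker-proxy process
    lsof -p "$pid" 2>/dev/null | grep -i netlink
    ls -l "/proc/$pid/fd" | grep socket
done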

Traces

https://gist.github.com/SpComb/197cf7c4191dcca261fdfeade0be3c54

When starting a new Docker container, the main Docker process appears to execute the libnetwork:osl.GetSandboxForExternalKey function, called via the sandbox_externalkey_unix mechanism: [1]

2489  11:13:36 accept4(11,  {sa_family=AF_LOCAL, NULL}, [2], SOCK_CLOEXEC|SOCK_NONBLOCK) = 52
2489  11:13:36 read(52,  "{\"ContainerID\":\"bdf2e17d74bcc83a6e28785394ca9744afc214c7dab1ce33ac0401d7619df660\",\"Key\":\"/proc/5428/ns/net\"}", 1280) = 108

This does createNamespaceFile: [2]

2489  11:13:36 stat("/var/run/docker/netns/c55b2525ccb9", 0xc820e40788) = -1 ENOENT (No such file or directory)
2489  11:13:36 openat(AT_FDCWD, "/var/run/docker/netns/c55b2525ccb9", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0666)   = 61
2489  11:13:36 close(61) = 0

And then mountNetworkNamespace: [3]

2489  11:13:36 mount("/proc/5428/ns/net", "/var/run/docker/netns/c55b2525ccb9", 0xc822434cd8, MS_BIND, NULL) = 0

And then netns.GetFromPath: [4]

2489  11:13:36 openat(AT_FDCWD, "/var/run/docker/netns/c55b2525ccb9", O_RDONLY ) = 61

It then runs netlink.NewHandleAt -> nl.GetSocketAt: [5]

2489  11:13:36 getpid()                 = 1493
2489  11:13:36 gettid()                 = 2489
2489  11:13:36 openat(AT_FDCWD, "/proc/1493/task/2489/ns/net", O_RDONLY)   = 73
2489  11:13:36 setns(61, CLONE_NEWNET)  = 0
2489  11:13:36 socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE)   = 84
2489  11:13:36 bind(84, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12)     = 0
2489  11:13:36 setns(73, CLONE_NEWNET )    = 0
2489  11:13:36 close(73) = 0

This leaves the main Docker process with an fd=84 netlink socket open, associated with the container network namespace.

When a Docker container with published ports gets restarted, the main Docker process forks and execs a new docker-proxy process: [6]

1824  11:14:02 clone(child_stack=0, flags=SIGCHLD) = 5965
5965  11:14:02 execve("/usr/bin/docker-proxy", ["/usr/bin/docker-proxy", "-proto", "udp", "-host-ip", "0.0.0.0", "-host-port", "1194", "-container-ip", "172.17.0.1", "-container-port", "1194"], [/* 12 vars */]) = 0

Later, the dockerd process's netlink socket is closed when the original container is stopped: [8]

3640  11:14:22 close(84)    = 0
3640  11:14:22 umount("/var/run/docker/netns/c55b2525ccb9", MNT_DETACH) = 0

But the previously forked docker-proxy process has inherited this fd=84 netlink socket: [7]

docker-pr 5965 root 84u sock 0,8 0t0 350413 protocol: NETLINK

The docker-proxy process does not know anything about this netlink socket, but having inherited it from the parent dockerd process is enough to keep the network namespace alive. See the previous comment for an analysis of what happens when the docker-proxy process exits.