canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Lot of packet loss on virtual nics #4836

Closed Nani-o closed 6 years ago

Nani-o commented 6 years ago

I'm experiencing a lot of packet loss (around 40%) on virtual NICs plugged into my default bridge.

Required information

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

$ uname -s -r -v
Linux 4.4.0-124-generic #148-Ubuntu SMP Wed May 2 13:00:18 UTC 2018

$ lxc --version
3.2

$ lxd --version
3.2

Issue description

I was experiencing a lot of disturbance using my services hosted on my LXD containers. After a long outage where pretty much nothing was working, I began investigating. What I found is that the network between my containers and my host (bridge lxdbr0) was losing a lot of packets.

Steps to reproduce

It's going to be difficult to give an exact way to reproduce this, since I don't know what causes it and have no clue what I could have done to create this behavior.

I can therefore give some information that I have and update this with any requested debug commands.

lxdbr0 bridge
$ lxc network show lxdbr0
config:
  ipv4.address: 192.168.2.1/24
  ipv4.nat: "true"
  ipv6.address: none
description: Default Network
name: lxdbr0
type: bridge
used_by:
[...]
managed: true
status: Created
locations:
- none

$ ip addr show lxdbr0
57: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 5a:33:27:65:98:4f brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::1c40:1bff:fe9c:27e/64 scope link
       valid_lft forever preferred_lft forever

example config of one of these containers

Here's the config of one of these containers:

$ lxc config show container
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu xenial amd64 (20180725_07:42)
  image.os: Ubuntu
  image.release: xenial
  image.serial: "20180725_07:42"
  volatile.base_image: ababdf07461531ef31d64e9cb17c61edaf42c6bc2959e12d7b97c4d006759a0b
  volatile.eth0.hwaddr: 00:16:3e:21:ec:1d
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    ipv4.address: 192.168.2.4
    nictype: bridged
    parent: lxdbr0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""

$ lxc profile show default
config:
  environment.http_proxy: ""
  security.privileged: "true"
  user.network_mode: ""
description: Default LXD profile
devices:
  eth0:
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
[...]
packet loss

As you can see, packet loss is only happening on virtual NICs that use the bridge:

$ ping 192.168.2.1 -c 10
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.070 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.059 ms
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=0.061 ms
64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=0.062 ms
64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=0.048 ms
64 bytes from 192.168.2.1: icmp_seq=6 ttl=64 time=0.067 ms
64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=0.075 ms
64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.062 ms
64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=0.062 ms
64 bytes from 192.168.2.1: icmp_seq=10 ttl=64 time=0.060 ms

--- 192.168.2.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.048/0.062/0.075/0.011 ms

$ ping 192.168.2.2 -c 10
PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data.
64 bytes from 192.168.2.2: icmp_seq=1 ttl=64 time=0.099 ms
64 bytes from 192.168.2.2: icmp_seq=2 ttl=64 time=0.081 ms
64 bytes from 192.168.2.2: icmp_seq=4 ttl=64 time=0.084 ms
64 bytes from 192.168.2.2: icmp_seq=5 ttl=64 time=0.080 ms
64 bytes from 192.168.2.2: icmp_seq=6 ttl=64 time=0.081 ms
64 bytes from 192.168.2.2: icmp_seq=7 ttl=64 time=0.083 ms
64 bytes from 192.168.2.2: icmp_seq=8 ttl=64 time=0.076 ms
64 bytes from 192.168.2.2: icmp_seq=9 ttl=64 time=0.085 ms

--- 192.168.2.2 ping statistics ---
10 packets transmitted, 8 received, 20% packet loss, time 9004ms
rtt min/avg/max/mdev = 0.076/0.083/0.099/0.012 ms

$ ping 192.168.2.3 -c 10
PING 192.168.2.3 (192.168.2.3) 56(84) bytes of data.
64 bytes from 192.168.2.3: icmp_seq=2 ttl=64 time=0.096 ms
64 bytes from 192.168.2.3: icmp_seq=3 ttl=64 time=0.087 ms
64 bytes from 192.168.2.3: icmp_seq=4 ttl=64 time=0.095 ms
64 bytes from 192.168.2.3: icmp_seq=9 ttl=64 time=0.086 ms
64 bytes from 192.168.2.3: icmp_seq=10 ttl=64 time=0.202 ms

--- 192.168.2.3 ping statistics ---
10 packets transmitted, 5 received, 50% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.086/0.113/0.202/0.045 ms
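
A quick sanity check here would be whether the loss also shows up in the interface drop counters; a minimal sketch (the veth name below is just a placeholder, not taken from this host):

$ ip -s link show lxdbr0          # RX/TX stats and drop counters for the bridge itself
$ ip -s link show vethXXXXXX      # same for a container's host-side veth (name is illustrative)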

Information to attach

Nani-o commented 6 years ago

I have just restarted the lxd service and the problem has gone away ...

Please tell me if I can still help debug this with logs or anything; otherwise, I can't think of any way to reproduce it, so don't hesitate to close. If this issue hits me again, I'll make sure not to restart anything so it can be debugged.

stgraber commented 6 years ago

It would have been interesting to run tcpdump -ni lxdbr0 while the problem was happening, to see if it was due to something like a traffic loop or some ARP issue; that'd be something to do should this happen again.
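
A minimal capture along those lines, assuming the problem is occurring at the time (the filter and packet count are just suggestions, adjust as needed):

$ sudo tcpdump -ni lxdbr0 -c 200 'arp or icmp'    # look for ARP storms or duplicate replies
$ sudo tcpdump -ni lxdbr0 -w lxdbr0.pcap          # or save a capture for later analysis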

Can you attach your /var/log/kern.log? Maybe there's still something in there from before the restart which may help figure this out.

stgraber commented 6 years ago

I'm closing this issue as LXD itself can't cause this behavior: all that LXD does with the network is create the bridge, create the veth devices and plug them in. Once the containers are running, LXD isn't involved at all in the data path of packets.

This issue is therefore either something wrong going on inside the containers (whether maliciously or due to a configuration issue) which causes a traffic loop, ARP flood or ARP poisoning, or it's a kernel issue in the bridge driver or the veth driver which causes the packet loss (hopefully, in the latter case, something may be logged in kern.log).

Note that the issue being closed doesn't mean we won't be looking at logs and asking more questions to track this down, just that we don't think there's anything actionable in the LXD codebase itself for this issue.
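
For anyone hitting something similar, the bridge/veth plumbing described above can be inspected directly from the host; a rough sketch using standard iproute2 tools (nothing LXD-specific assumed):

$ bridge link show | grep lxdbr0    # veth ports currently plugged into the bridge
$ bridge fdb show br lxdbr0         # learned MACs per port; a MAC flapping between ports would hint at a loop or a duplicate address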

Nani-o commented 6 years ago

Hi !

There is indeed something going on in my kernel log:

$ grep 'Jul 26' /var/log/kern.log  | wc -l
1067300
$ grep 'Jul 26' /var/log/kern.log | grep lxdbr0 | wc -l
1067223
$ grep 'Jul 26' /var/log/kern.log | grep lxdbr0 | tail -5
Jul 26 13:11:59 sd-107191 kernel: [1416925.737229] device lxdbr0 entered promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.739950] device lxdbr0 left promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.742396] device lxdbr0 entered promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.745885] device lxdbr0 left promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.982741] device lxdbr0 entered promiscuous mode

Here's the interesting part (right around when I restarted; there is nothing else after this for that day):

Jul 26 13:11:59 sd-107191 kernel: [1416925.729429] device lxdbr0 entered promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.734436] device lxdbr0 left promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.737229] device lxdbr0 entered promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.739950] device lxdbr0 left promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.742396] device lxdbr0 entered promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.745885] device lxdbr0 left promiscuous mode
Jul 26 13:11:59 sd-107191 kernel: [1416925.982741] device lxdbr0 entered promiscuous mode
Jul 26 13:12:02 sd-107191 kernel: [1416929.659348] cgroup: new mount options do not match the existing superblock, will be ignored
Jul 26 13:12:02 sd-107191 kernel: [1416929.659812] cgroup: new mount options do not match the existing superblock, will be ignored
Jul 26 13:12:05 sd-107191 kernel: [1416932.685193] audit: type=1400 audit(1532603525.965:1266): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-certbot_</var/snap/lxd/common/lxd>" pid=24099 comm="apparmor_parser"
Jul 26 13:12:06 sd-107191 kernel: [1416932.832706] device veth5U5BWV entered promiscuous mode
Jul 26 13:12:06 sd-107191 kernel: [1416932.834994] IPv6: ADDRCONF(NETDEV_UP): veth5U5BWV: link is not ready
Jul 26 13:12:06 sd-107191 kernel: [1416932.881483] eth0: renamed from veth1X04C7
Jul 26 13:12:06 sd-107191 kernel: [1416932.913211] IPv6: ADDRCONF(NETDEV_CHANGE): veth5U5BWV: link becomes ready
Jul 26 13:12:06 sd-107191 kernel: [1416933.608340] audit: type=1400 audit(1532603526.889:1267): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-hound_</var/snap/lxd/common/lxd>" pid=24180 comm="apparmor_parser"
Jul 26 13:12:06 sd-107191 kernel: [1416933.639974] audit: type=1400 audit(1532603526.921:1268): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-certbot_</var/snap/lxd/common/lxd>" name="/" pid=24187 comm="mount" flags="rw, remount"
Jul 26 13:12:07 sd-107191 kernel: [1416933.823469] device vethKRINQW entered promiscuous mode
Jul 26 13:12:07 sd-107191 kernel: [1416933.825778] IPv6: ADDRCONF(NETDEV_UP): vethKRINQW: link is not ready
Jul 26 13:12:07 sd-107191 kernel: [1416933.876633] eth0: renamed from veth00TQ9F
Jul 26 13:12:07 sd-107191 kernel: [1416933.897366] IPv6: ADDRCONF(NETDEV_CHANGE): vethKRINQW: link becomes ready
Jul 26 13:12:08 sd-107191 kernel: [1416934.764246] audit: type=1400 audit(1532603528.045:1269): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-lab-c6_</var/snap/lxd/common/lxd>" pid=24393 comm="apparmor_parser"
Jul 26 13:12:08 sd-107191 kernel: [1416935.173544] device veth4E3TOP entered promiscuous mode
Jul 26 13:12:08 sd-107191 kernel: [1416935.176053] IPv6: ADDRCONF(NETDEV_UP): veth4E3TOP: link is not ready
Jul 26 13:12:08 sd-107191 kernel: [1416935.232658] eth0: renamed from veth0R8A1U
Jul 26 13:12:08 sd-107191 kernel: [1416935.279721] IPv6: ADDRCONF(NETDEV_CHANGE): veth4E3TOP: link becomes ready
Jul 26 13:12:09 sd-107191 kernel: [1416935.944974] audit: type=1400 audit(1532603529.225:1270): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxd-lab-c6_</var/snap/lxd/common/lxd>" name="/proc/" pid=24702 comm="mount" flags="rw, nosuid, nodev, noexec, remount, relatime"
Jul 26 13:12:09 sd-107191 kernel: [1416936.046667] audit: type=1400 audit(1532603529.325:1271): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-lab-c7_</var/snap/lxd/common/lxd>" pid=24727 comm="apparmor_parser"
Jul 26 13:12:09 sd-107191 kernel: [1416936.421435] IPv6: ADDRCONF(NETDEV_CHANGE): vethGPSGE7: link becomes ready
Jul 26 13:12:10 sd-107191 kernel: [1416937.524422] device vethVBLU15 entered promiscuous mode
Jul 26 13:12:10 sd-107191 kernel: [1416937.527090] IPv6: ADDRCONF(NETDEV_UP): vethVBLU15: link is not ready
Jul 26 13:12:10 sd-107191 kernel: [1416937.585444] eth0: renamed from vethFGSDVR
Jul 26 13:12:10 sd-107191 kernel: [1416937.613512] IPv6: ADDRCONF(NETDEV_CHANGE): vethVBLU15: link becomes ready
Jul 26 13:12:12 sd-107191 kernel: [1416938.920474] device veth4RMQRA entered promiscuous mode
Jul 26 13:12:12 sd-107191 kernel: [1416938.923349] IPv6: ADDRCONF(NETDEV_UP): veth4RMQRA: link is not ready
Jul 26 13:12:12 sd-107191 kernel: [1416938.968755] eth0: renamed from vethSICCMO
Jul 26 13:12:12 sd-107191 kernel: [1416939.009935] IPv6: ADDRCONF(NETDEV_CHANGE): veth4RMQRA: link becomes ready
Jul 26 13:12:13 sd-107191 kernel: [1416940.304324] audit: type=1400 audit(1532603533.585:1275): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-lab-d9_</var/snap/lxd/common/lxd>" pid=25613 comm="apparmor_parser"
Jul 26 13:12:13 sd-107191 kernel: [1416940.703690] IPv6: ADDRCONF(NETDEV_UP): vethXDQLOF: link is not ready
Jul 26 13:12:14 sd-107191 kernel: [1416940.795915] IPv6: ADDRCONF(NETDEV_CHANGE): vethXDQLOF: link becomes ready
Jul 26 13:12:14 sd-107191 kernel: [1416940.918267] audit: type=1400 audit(1532603534.197:1278): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-lab-d7_</var/snap/lxd/common/lxd>" name="/run/" pid=25990 comm="mount" flags="rw, nosuid, noexec, remount, relatime"
Jul 26 13:12:14 sd-107191 kernel: [1416940.960560] audit: type=1400 audit(1532603534.241:1279): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-lab-d7_</var/snap/lxd/common/lxd>" name="/run/lock/" pid=26024 comm="mount" flags="rw, nosuid, nodev, noexec, remount, relatime"
Jul 26 13:12:14 sd-107191 kernel: [1416940.972799] audit: type=1400 audit(1532603534.253:1280): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxd-lab-d7_</var/snap/lxd/common/lxd>" name="/proc/" pid=26034 comm="mount" flags="rw, nosuid, nodev, noexec, remount, relatime"
Jul 26 13:12:15 sd-107191 kernel: [1416941.791473] audit: type=1400 audit(1532603535.073:1284): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-lab-u14_</var/snap/lxd/common/lxd>" pid=26658 comm="apparmor_parser"
Jul 26 13:12:15 sd-107191 kernel: [1416942.120747] device vethUMSDXC entered promiscuous mode
Jul 26 13:12:15 sd-107191 kernel: [1416942.188867] eth0: renamed from vethMMLWR3
Jul 26 13:12:15 sd-107191 kernel: [1416942.203896] IPv6: ADDRCONF(NETDEV_CHANGE): vethUMSDXC: link becomes ready
Jul 26 13:12:17 sd-107191 kernel: [1416943.771558] IPv6: ADDRCONF(NETDEV_UP): veth5RSTHU: link is not ready
Jul 26 13:12:17 sd-107191 kernel: [1416943.836870] eth0: renamed from vethPY0NJ6
Jul 26 13:12:19 sd-107191 kernel: [1416945.881652] audit_printk_skb: 36 callbacks suppressed
Jul 26 13:12:19 sd-107191 kernel: [1416945.881658] audit: type=1400 audit(1532603539.161:1297): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-lab-d9_</var/snap/lxd/common/lxd>" name="/" pid=28517 comm="(resolved)" flags="rw, rslave"
Jul 26 13:12:19 sd-107191 kernel: [1416946.293540] audit: type=1400 audit(1532603539.573:1299): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-lab-d9_</var/snap/lxd/common/lxd>" name="/" pid=28950 comm="(sh)" flags="rw, rslave"
Jul 26 13:12:20 sd-107191 kernel: [1416946.874322] IPv6: ADDRCONF(NETDEV_UP): vethTJCNR2: link is not ready
Jul 26 13:12:22 sd-107191 kernel: [1416948.937241] audit: type=1400 audit(1532603542.217:1300): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-nginx_</var/snap/lxd/common/lxd>" pid=29870 comm="apparmor_parser"
Jul 26 13:12:24 sd-107191 kernel: [1416951.193685] IPv6: ADDRCONF(NETDEV_CHANGE): veth16YDTF: link becomes ready
Jul 26 13:12:26 sd-107191 kernel: [1416952.997330] device vethAGS1AP entered promiscuous mode
Jul 26 13:12:28 sd-107191 kernel: [1416955.453383] device veth4HFKL2 entered promiscuous mode
Jul 26 13:12:29 sd-107191 kernel: [1416956.695256] audit: type=1400 audit(1532603549.972:1303): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-lab-u18_</var/snap/lxd/common/lxd>" name="/" pid=30807 comm="(networkd)" flags="rw, rslave"
Jul 26 13:12:31 sd-107191 kernel: [1416958.255861] audit: type=1400 audit(1532603551.536:1307): apparmor="STATUS" operation="profile_load" label="lxd-lab-u16_</var/snap/lxd/common/lxd>//&:lxd-lab-u16_<var-snap-lxd-common-lxd>://unconfined" name="/sbin/dhclient" pid=30796 comm="apparmor_parser"
Jul 26 13:12:31 sd-107191 kernel: [1416958.315796] audit: type=1400 audit(1532603551.596:1308): apparmor="STATUS" operation="profile_load" label="lxd-lab-u16_</var/snap/lxd/common/lxd>//&:lxd-lab-u16_<var-snap-lxd-common-lxd>://unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=30796 comm="apparmor_parser"
Jul 26 13:12:31 sd-107191 kernel: [1416958.318482] audit: type=1400 audit(1532603551.596:1310): apparmor="STATUS" operation="profile_load" label="lxd-lab-u16_</var/snap/lxd/common/lxd>//&:lxd-lab-u16_<var-snap-lxd-common-lxd>://unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=30796 comm="apparmor_parser"

Let me know if I can add more information.

stgraber commented 6 years ago

The in-and-out of promiscuous mode is a bit odd, unless you were actively running a bunch of tcpdump sessions or using other network monitoring software that would similarly need promiscuous access to the interfaces.

The rest looks about as expected, assuming containers were actively being started and stopped. It shows new veth devices being created, joining the bridge and coming up, the container side of each pair getting renamed to eth0, followed by the containers' apparmor profiles getting loaded and some of the apparmor rules getting hit during container startup.

Nani-o commented 6 years ago

I can't think of anything running on my server that would need promiscuous mode on an interface.

My kern.log is filled with in/out promiscuous mode messages from Jul 24 17:50:18 to Jul 26 13:11:59. I had seen a lot of disturbance and problems accessing my services since the 25th of July. At that point I just restarted what I thought were failed containers, and since that got things working again, I didn't bother digging further. It was only yesterday, when I saw similar problems, that I began searching for the cause.

The only thing I remember doing on my server over the past few days was testing the networks endpoint of the LXD API with an Ansible module.

stgraber commented 6 years ago

Going in/out of promisc mode is weird and not something you should be seeing during normal operation; that is, unless you're running some network monitoring software, an intrusion detection system, or manual tcpdump/tshark/wireshark sessions, all of which depend on the interface being (if only temporarily) in promisc mode.
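
If it happens again, a couple of generic ways to see what is flipping promisc mode (both are standard tools, nothing LXD-specific):

$ sudo ss -0 -p                                      # processes holding packet sockets (tcpdump, tshark, IDS agents, ...)
$ tail -f /var/log/kern.log | grep -i promiscuous    # watch the transitions live to correlate them with other activity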

BartVB commented 5 years ago

Does going in/out of promiscuous mode coincide with starting/stopping containers?

I had similar problems (with LXC, not LXD); they were caused by another container using the same IP address. Also make sure that you are using unique MAC addresses per interface.
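
A rough way to check for the duplicate-address case described above (the address below is just one of the container IPs from this report; adjust as needed):

$ sudo arping -D -I lxdbr0 -c 3 192.168.2.2    # duplicate address detection: any reply means another host claims the IP
$ for c in $(lxc list -c n --format csv); do echo "$c $(lxc config get "$c" volatile.eth0.hwaddr)"; done    # compare MACs across containers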