rootless-containers / slirp4netns

User-mode networking for unprivileged network namespaces
GNU General Public License v2.0

MTU-related severe performance issues #284

Open rlpowell opened 2 years ago

rlpowell commented 2 years ago

This may be the same thing as https://github.com/rootless-containers/slirp4netns/issues/128 , as that issue also shows a 5 second delay at the beginning, but I can't tell for sure, and I have enough detail here that I didn't want to clutter that issue up.

Short version: in rootless podman with everything at defaults (i.e. slirp4netns with its default MTU of 65520), curl of a file larger than the MTU takes 5 seconds when it should take far less than a second. Reducing the MTU fixes it.
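(If you want the lower MTU to be the default for all your rootless containers, rather than passing it per container, something along these lines in containers.conf should also work; network_cmd_options under [engine] is my reading of the containers.conf docs, and mtu=48000 is just an example value:)

$ mkdir -p ~/.config/containers
$ cat >> ~/.config/containers/containers.conf <<'EOF'
[engine]
# options passed through to slirp4netns; example value only
network_cmd_options = ["mtu=48000"]
EOF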

My environment:

$ sudo yum list installed '*podman*' '*slirp*'
[sudo] password for rlpowell:
Installed Packages
libslirp.x86_64        4.6.1-2.fc35    @fedora
podman.x86_64          3:3.4.4-1.fc35  @updates
podman-gvproxy.x86_64  3:3.4.4-1.fc35  @updates
podman-plugins.x86_64  3:3.4.4-1.fc35  @updates
slirp4netns.x86_64     1.1.12-2.fc35   @fedora

$ cat /etc/redhat-release
Fedora release 35 (Thirty Five)

Repro:

Dockerfile:

FROM fedora:35

RUN yum -y install netcat time

Run podman build -t slirptest .
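(As a quick sanity check before timing anything, you can confirm what MTU the container actually got; this assumes the slirp4netns interface inside the container is named tap0, which is the default, and with default settings it should print 65520:)

$ podman run --rm slirptest cat /sys/class/net/tap0/mtu
65520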

In another window on the same host (maybe in a temp dir):

$ dd bs=1024 count=64 if=/dev/zero of=64k_file.bin
64+0 records in
64+0 records out
65536 bytes (66 kB, 64 KiB) copied, 0.000972638 s, 67.4 MB/s
$ dd bs=1024 count=63 if=/dev/zero of=63k_file.bin
63+0 records in
63+0 records out
64512 bytes (65 kB, 63 KiB) copied, 0.000511444 s, 126 MB/s
$ python -m http.server 8081
Serving HTTP on 0.0.0.0 port 8081 (http://0.0.0.0:8081/) ...

In another window:

$ podman run --rm -it slirptest bash
# time curl [IP of host]:8081/64k_file.bin | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65536  100 65536    0     0  13108      0  0:00:04  0:00:04 --:--:--  8627
      0       0   65536

real    0m5.012s
user    0m0.013s
sys     0m0.008s

Note that it pauses for about 5 seconds after the first chunk of data.
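(To see what happens during that pause, a packet capture on the host's loopback interface while the transfer runs should show the stall; this is just a generic tcpdump invocation, not output from the report above:)

$ sudo tcpdump -i lo -n -ttt 'tcp port 8081'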

Then try:

$ podman run --rm -it --net slirp4netns:mtu=1500 slirptest bash
#  time curl 192.168.123.134:8081/64k_file.bin | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65536  100 65536    0     0  19.3M      0 --:--:-- --:--:-- --:--:-- 31.2M
      0       0   65536

real    0m0.016s
user    0m0.008s
sys     0m0.010s

That's over a 300x performance difference in wall-clock time. :D

Also:

$ podman run --rm -it slirptest bash
# time curl 192.168.123.134:8081/63k_file.bin | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 64512  100 64512    0     0  22.0M      0 --:--:-- --:--:-- --:--:-- 30.7M
      0       0   64512

real    0m0.016s
user    0m0.008s
sys     0m0.010s

So the 64k file causes the problem, but the 63k file does not.

In case it's relevant, here are the host's MTU configurations:

$ ip addr | grep -i mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: enp5s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000

According to tcpdump, the communication in question goes over lo.

My binary search shows that the issue doesn't occur at mtu=48000 and lower, but does occur at mtu=48500 and higher. I have no idea what the significance of that is.
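(If anyone wants to repeat the search, the probing can be scripted; a rough sketch, assuming the same HTTP server on port 8081 and the slirptest image from above, timing the in-container curl for each candidate MTU:)

for mtu in 48000 48250 48500 65520; do
  echo "mtu=$mtu"
  podman run --rm --net slirp4netns:mtu=$mtu slirptest \
    bash -c "time curl -s -o /dev/null [IP of host]:8081/64k_file.bin"
done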

$ podman run --rm -it --net slirp4netns:mtu=48000 slirptest bash
[root@4b1d4880bb30 /]# time curl 192.168.123.134:8081/64k_file.bin | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65536  100 65536    0     0  21.2M      0 --:--:-- --:--:-- --:--:-- 31.2M
      0       0   65536

real    0m0.016s
user    0m0.010s
sys     0m0.008s
[root@4b1d4880bb30 /]#
exit
$ podman run --rm -it --net slirp4netns:mtu=48500 slirptest bash
[root@99a0585cffb0 /]# time curl 192.168.123.134:8081/64k_file.bin | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65536  100 65536    0     0  11401      0  0:00:05  0:00:05 --:--:--  7208
      0       0   65536

real    0m5.762s
user    0m0.008s
sys     0m0.013s
[root@99a0585cffb0 /]#
mg90707 commented 2 years ago

I just wanted to add that I have severe performance issues with the default MTU of 65520, too. This is on a Windows host with a Linux guest VM (Ubuntu 20.04), running iperf3 from a container on the VM and connecting to the host:

Outside of container: 5 Gbit/s
Rootful container: 5 Gbit/s
Rootless container, MTU 1500: 1.5 Gbit/s
Rootless container, MTU 65520: 60 Mbit/s (yes, megabits)
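(For reference, the container-to-host measurement above boils down to a plain iperf3 client/server pair; <host IP> is a placeholder, and the MTU is whatever the rootless runtime passed to slirp4netns:)

# on the host (or the VM), start the server:
$ iperf3 -s
# inside the container, point the client at the host:
$ iperf3 -c <host IP>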

EDIT: For completeness, here are the stats when connecting from the host to the container with the slirp4netns port driver. No MTU-dependent slowdown here:

Outside of container: 3 Gbit/s
Rootful container: 3 Gbit/s
Rootless container, MTU 1500: 1.6 Gbit/s
Rootless container, MTU 65520: 1.8 Gbit/s

EDIT again: For the tests I was using Docker Rootless v20.10.12.

srstsavage commented 2 years ago

Can verify the above. In a rootless Docker environment using nginx 1.20 to proxy internal containers, we were seeing many of the requests to nginx take 10+ seconds while requests directly to the proxied services took less than a second. MTU on the Docker daemon was set to 65520. Reducing MTU to 48000 fixed this issue.
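(For rootless Docker, the daemon-level change described above comes down to something like this, assuming the daemon reads ~/.config/docker/daemon.json as usual and that 48000 is the value that worked for you:)

$ cat ~/.config/docker/daemon.json
{
  "mtu": 48000
}
$ systemctl --user restart docker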

Using docker run --sysctl net.ipv4.tcp_rmem="4096 87380 6291456" as suggested in the slirp4netns README did not fix the issue and seemed to have no effect, although it's possible this was user error ¯\_(ツ)_/¯

Ubuntu 20.04.2
Docker (rootless) 20.10.6
slirp4netns 0.4.3

yaroslavsadin commented 8 months ago

Same issue. Using podman 4.0.2 and slirp4netns 1.1.12 on AlmaLinux 9.0. In my case, though, MTU ~20000 is the sweet spot.

MTU=1500 gives ~1 Gbit/s
MTU=20000 gives ~9 Gbit/s (about the same as --network=host)
MTU=48000 goes back to ~1 Gbit/s
MTU=65520 goes below 1 Gbit/s

These are iperf results between the container and another server in the same subnet. --sysctl net.ipv4.tcp_rmem="4096 87380 6291456" didn't help either.
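(The sweep described above could be scripted roughly like this; <server IP> is a placeholder and the image is assumed to have iperf3 installed:)

for mtu in 1500 20000 48000 65520; do
  echo "mtu=$mtu"
  podman run --rm --net slirp4netns:mtu=$mtu <image with iperf3> iperf3 -c <server IP>
done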