esnet / iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

"iperf3: interrupt - the server has terminated" immediately upon server start #1016

Open timblaktu opened 4 years ago

timblaktu commented 4 years ago

Context

iperf server running on:

Debian Buster VM running on VMware vSphere v6.7

iperf 3.6 (cJSON 1.5.2)
Linux jenkins-master 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1+deb10u1 (2020-04-27) x86_64
Optional features available: CPU affinity setting, IPv6 flow label, SCTP, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication

Bug Report

iperf3 daemon logs the following to its --logfile (with --debug and --forceflush) and never logs again.

jenkins@jenkins-master:~$ cat /var/log/iperf.log
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
iperf3: interrupt - the server has terminated

The daemon process still exists after this point:

jenkins@jenkins-master:~$ ps aux | grep iperf
root       1984  0.0  0.0   7752   500 ?        Ss   10:44   0:00 /usr/bin/iperf3 -sD --logfile /var/log/iperf.log --debug --forceflush
jenkins    3343  0.0  0.0   6208   824 pts/0    S+   11:40   0:00 grep iperf

However, it doesn't accept connections from iperf3 clients over UDP or TCP on any of the ports I have tried (5201, 8000).

The daemon is started by the following init script:

jenkins@jenkins-master:~$ sudo cat /etc/init.d/iperf3
#!/bin/bash
### BEGIN INIT INFO
# Provides:          iperf3
# Required-Start:    $local_fs
# Required-Stop:
# Default-Start:     3
# Default-Stop:      0
# Short-Description: run iperf3 at startup so i can always run network perf tests.
### END INIT INFO

# copy me to /etc/init.d and install with sudo update-rc.d iperf3 defaults

case "$1" in
  start)
        logger 'starting iperf3...'
        # run in the background when starting up, so we don't block init
        # -D (daemon mode) does this for us, so we don't have to use &
        /usr/bin/iperf3 -sD --logfile /var/log/iperf.log --debug --forceflush
        ;;
  restart|reload|force-reload|status|stop)
        echo "this command does nothing for iperf3."
        ;;
  *)
        exit 1
        ;;
esac

bmah888 commented 4 years ago

If you run /usr/bin/iperf3 -sD --logfile /var/log/iperf.log --debug --forceflush from a normal login shell, does it work then? Also, is this iperf3 part of the Debian distribution you're using or did you build it from source?

It works for me running from the command line on my CentOS 7 dev VM. So I know this feature isn't totally broken. I'm thinking about what kinds of differences there are between your startup script runtime environment and what I just did from a shell. Haven't thought of anything obvious yet.

timblaktu commented 4 years ago

@bmah888, I disabled the iperf service, rebooted the server, then ran sudo /usr/bin/iperf3 -sD --logfile /var/log/iperf-test.log --debug --forceflush from a normal shell (sudo is needed to write the log file in /var/log), specifying a new log file, and I do not get the error in the log. So perhaps we can assume the "server has terminated" problem is related to running as a Debian service (or perhaps a permissions/context issue)?

However, I still do not get the expected behavior when connecting from remote clients. When I run iperf3 -c <server_hostname>, nothing is printed to the client's stdout and nothing is printed to the server log.

I AM able to get the expected behavior when running the iperf3 client on the same machine as the server, though. I assume this means the iperf3 server is not binding to the correct interface. I see in the man page it says:

       -B, --bind host
              bind to the specific interface associated with address host.  **If the host has
              multiple interfaces, it will use the first interface by default.**

On this server, ip a lists the interfaces in this order:

jenkins@jenkins-master:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:97:df:75 brd ff:ff:ff:ff:ff:ff
    inet 172.16.22.20/24 brd 172.16.22.255 scope global dynamic ens192
       valid_lft 258331sec preferred_lft 258331sec
    inet6 fe80::250:56ff:fe97:df75/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:a2:6a:53:b0 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

I then killed the daemon and ran the same command with --bind 172.16.22.20, and confirmed it's now accepting remote connections and operating as expected.
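One way to confirm which address the daemon is actually listening on, rather than inferring it from client behavior, is to inspect the listening sockets with ss. The following is a minimal sketch, assuming the iproute2 ss utility is installed on the server; listen_addrs is a hypothetical helper name and not part of iperf3.

```shell
#!/bin/sh
# listen_addrs PORT
# Hypothetical helper: reads `ss -ltn` output on stdin and prints the
# local address:port of every TCP socket listening on PORT.
listen_addrs() {
    # With `ss -ltn`, column 4 is "Local Address:Port". The header line is
    # skipped because its first column is "State", not "LISTEN".
    awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}
```

Running ss -ltn | listen_addrs 5201 on the server would then show the bound address: 127.0.0.1:5201 would indicate a loopback-only listener, while a wildcard such as 0.0.0.0:5201, *:5201 or [::]:5201 would indicate that the server is listening on all addresses and the problem lies elsewhere (a firewall, for example).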

However, is it expected behavior that in this case it would only bind to localhost by default? I'm pretty sure I had this same configuration working on other machines without having to use the --bind argument. Those machines were running an older version of iperf3, the one in the Debian stretch repository (3.1.3). Perhaps the default interface binding worked differently in the older version. Not a huge deal as we're moving everything to buster, just wanted to provide context.

The version of iperf3 on the machines described in the original post is from the Debian buster repository:

jenkins@jenkins-master:~$ sudo apt show iperf3
Package: iperf3
Version: 3.6-2
Priority: optional
Section: net
Maintainer: Roberto Lumbreras <rover@debian.org>
Installed-Size: 56.3 kB
Depends: libc6 (>= 2.11), libiperf0 (>= 3.1.3), libsctp1 (>= 1.0.6.dfsg), libssl1.1 (>= 1.1.0)
Homepage: http://software.es.net/iperf/
Download-Size: 25.9 kB
APT-Manual-Installed: yes
APT-Sources: http://deb.debian.org/debian buster/main amd64 Packages
Description: Internet Protocol bandwidth measuring tool
 Iperf3 is a tool for performing network throughput measurements. It can
 test either TCP or UDP throughput.
 .
 This is a new implementation that shares no code with the original
 iperf from NLANR/DAST and also is not backwards compatible.
 .
 This package contains the command line utility.

I will next experiment with explicitly specifying --bind option in my init script and see if I can get it working properly as a linux service at boot.
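One thing to watch for when adding --bind to a boot-time script: binding to a specific address fails if that address has not been assigned yet, and the init script above only declares $local_fs in Required-Start while ens192 gets its address dynamically. Whether that ordering actually matters on this machine is an assumption, not something established in this thread. A minimal sketch of a guard, assuming the iproute2 ip utility is available; has_addr and wait_for_addr are hypothetical helper names:

```shell
#!/bin/sh
# has_addr ADDR
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds (exit 0) only if ADDR is assigned to some interface.
has_addr() {
    awk -v addr="$1" '
        # In one-line mode, field 3 is "inet" and field 4 is "ADDR/PREFIX".
        $3 == "inet" { split($4, a, "/"); if (a[1] == addr) found = 1 }
        END { exit !found }
    '
}

# wait_for_addr ADDR [TRIES]
# Polls once per second, up to TRIES times (default 30), until ADDR is
# present. Returns 0 when it appears, 1 if it never does.
wait_for_addr() {
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        ip -o -4 addr show | has_addr "$1" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

In the start) branch of the init script, this could be used as wait_for_addr 172.16.22.20 && /usr/bin/iperf3 -sD --bind 172.16.22.20 --logfile /var/log/iperf.log --debug --forceflush, so the daemon is only launched once the address exists.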

timblaktu commented 4 years ago

So far, things appear to be working correctly now that I'm explicitly specifying --bind <ip_address> in my iperf3 service call. Now, starting iperf3 daemon as a Debian service, remote clients are able to connect and run iperf3 tests, without the daemon crashing like before. :-)

Another odd thing I'm noticing, however, is that I'm not seeing verbose output in the specified log file during (or after) these client tests. The full command my service script runs is:

/usr/bin/iperf3 -sD --bind 172.16.22.20 --logfile /var/log/iperf.log --debug --forceflush

I did see this log output when I was running the above command in a normal shell instead of as a service. Any ideas on this? Thanks.

bmah888 commented 4 years ago

So it sounds like you're seeing two different problems...one is a permissions issue (probably?) about not being able to write the log file, and another about only being bound to the loopback interface. You're able to work around the second problem but not the first.
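To test the permissions guess directly, the service's start path could check whether the log file is writable in its own runtime context before launching the daemon. This is only a sketch: can_append is a hypothetical helper name, and it covers ordinary file permissions only, not mandatory access control or service sandboxing. Since the ps output earlier in the thread shows the daemon running as root, plain file permissions may well not be the explanation.

```shell
#!/bin/sh
# can_append FILE
# Hypothetical helper: succeeds if the current user could append to FILE,
# either because FILE exists as a writable regular file, or because it
# does not exist yet and its parent directory is writable.
can_append() {
    if [ -e "$1" ]; then
        [ -f "$1" ] && [ -w "$1" ]
    else
        [ -w "$(dirname "$1")" ]
    fi
}
```

Calling can_append /var/log/iperf.log || logger 'iperf3: log file not writable' from the init script would record the result in syslog, which keeps working even when the iperf3 log itself stays empty.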

Since iperf3 seems to be running fine if it's run manually, I'm going to make the unsurprising guess it has something to do with running as a service. This is a little outside my area (I'm more of a BSD person), and as I might have mentioned upthread, this is not a use case that we exercise at ESnet. I do know that people are running iperf3 as a service (see some other issues here in the issue tracker), so I'm hoping someone knowledgeable sees this and can chime in.