esnet / iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

unable to receive control message: Connection reset by peer #1683

Open nshiveg opened 7 months ago

nshiveg commented 7 months ago

We are constantly getting "Connection reset by peer" when using iperf3.

The following error appears:

iperf 3.5
Linux localhost.localdomain 4.18.0-477.21.1.el8_8.x86_64 #1 SMP Tue Aug 8 21:30:09 UTC 2023 x86_64
iperf3: error - unable to receive control message: Connection reset by peer
Control connection MSS 1404
Time: Thu, 18 Apr 2024 07:46:13 GMT
Connecting to host 30.10.0.1, port 6002
Cookie: wmo2ueosgoevosxf2iuucucx3kcm3amzager
[ 5] local 30.10.0.0 port 45393 connected to 30.10.0.1 port 6002

This happens predominantly when we trigger the command through the Robot Framework keyword SSHLibrary.Write ${command}

class SSHLibrary(object):
    """SSHLibrary is a Robot Framework test library for SSH and SFTP.

This is a Python library. Its Write keyword is defined as:

def write(self, text, loglevel=None):
    """Writes the given ``text`` on the remote machine and appends a newline.

    Appended `newline` can be configured.

    This keyword returns and consumes the written ``text``
    (including the appended newline) from the server output. See the
    `Interactive shells` section for more information.

    The written ``text`` is logged. ``loglevel`` can be used to override
    the default `log level`.

    Example:
    | ${written}=          | `Write`         | su                         |
    | `Should Contain`     | ${written}      | su                         | # Returns the consumed output  |
    | ${output}=           | `Read`          |
    | `Should Not Contain` | ${output}       | ${written}                 | # Was consumed from the output |
    | `Should Contain`     | ${output}       | Password:                  |
    | `Write`              | invalidpasswd   |
    | ${output}=           | `Read`          |
    | `Should Contain`     | ${output}       | su: Authentication failure |

    See also `Write Bare`.
    """
    self._write(text, add_newline=True)
    return self._read_and_log(loglevel, self.current.read_until_newline)

nshiveg commented 7 months ago

NAME="Rocky Linux"
VERSION="8.8 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.8 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.8"

My OS details are shown above.

davidBar-On commented 7 months ago

A few comments/questions:

  1. I assume that all the information given is related to the client. Can you also send the client command used ("iperf3 -c ....")?
  2. Did you try running this client command manually, and did it succeed?
  3. The error message "Connection reset by peer" indicates that the server closed the connection. What related errors/messages did the server log?

(By the way, 3.5 is quite an old version, especially for a build done in Aug. 2023. Is there a reason for not using a newer version?)
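
For reference, "Connection reset by peer" corresponds to the ECONNRESET error, which a TCP endpoint reports after it receives an RST segment. iperf3 uses a TCP control connection even for UDP tests, which is why this error can show up in a UDP run. The Python sketch below is only an illustration and is not iperf3 code: it reproduces the error on localhost by having a toy server abort the connection. The function name reset_by_peer_demo and the use of SO_LINGER with a zero timeout to force an RST are assumptions chosen for the demonstration.

```python
import socket
import struct
import threading


def reset_by_peer_demo():
    """Return the name of the exception a client sees when its peer aborts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def abort_first_connection():
        conn, _ = listener.accept()
        # SO_LINGER enabled with a zero timeout makes close() send an RST
        # instead of the normal FIN handshake.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
        conn.close()

    worker = threading.Thread(target=abort_first_connection, daemon=True)
    worker.start()
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    worker.join()  # the toy server has aborted the connection by now
    try:
        client.recv(1)
        return "no error"
    except OSError as exc:
        return type(exc).__name__
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(reset_by_peer_demo())
```

An RST can also be generated by a device in the path, such as a firewall or a NAT box, so the error by itself does not prove that the server process closed the connection.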

nshiveg commented 7 months ago

This is the client command used:

iperf3 -c 30.10.0.1 -i1 -p 5004 -t 80 -b 1m -l 1300 -V -R -u

And this is the server command:

[ndacblr@localhost ~]$ iperf3 -s -i 1 -p 5002

Server listening on 5002

Accepted connection from 10.10.31.86, port 42332
[ 5] local 20.10.20.1 port 5002 connected to 10.10.31.86 port 53556
iperf3: error - unable to receive control message: Connection reset by peer

Reason for using version 3.5:

Our yum repository provides iperf 3.5, so we did not pick this version manually.

nshiveg commented 7 months ago

We have seen that TCP works fine, whereas UDP fails.

davidBar-On commented 7 months ago

Some further clarifications will help:

  1. Who is issuing the error "iperf3: error - unable to receive control message: Connection reset by peer"? The server? The client? Both? In any case, please include the error messages from both sides.
  2. The server command includes "-p 5002" while the client uses "-p 5004", but the test cannot work when the ports differ. Did you list the correct client and server commands?
  3. Did you try a non-reverse UDP test, i.e. without the "-R"? Does it work?
  4. Do both client and server use version 3.5? If you can build a newer version of iperf3, it may help, since some issues with reverse ("-R") tests were fixed, although I didn't find a fix that seems to be directly related.
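
The port check from item 2 can be automated before a test run. The following Python sketch is a hypothetical helper, not part of iperf3: it extracts the value of -p/--port from a command line and falls back to 5201, the iperf3 default port. It assumes there is no port translation between the two hosts, in which case differing ports could be legitimate.

```python
import shlex


def iperf3_port(command, default=5201):
    """Return the port passed with -p/--port in an iperf3 command line."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg in ("-p", "--port") and i + 1 < len(args):
            return int(args[i + 1])          # "-p 5004" or "--port 5004"
        if arg.startswith("--port="):
            return int(arg.split("=", 1)[1])  # "--port=5004"
        if arg.startswith("-p") and arg[2:].isdigit():
            return int(arg[2:])               # "-p5004"
    return default


# Commands as quoted earlier in this thread.
client_cmd = "iperf3 -c 30.10.0.1 -i1 -p 5004 -t 80 -b 1m -l 1300 -V -R -u"
server_cmd = "iperf3 -s -i 1 -p 5002"

if iperf3_port(client_cmd) != iperf3_port(server_cmd):
    print("port mismatch:", iperf3_port(client_cmd), iperf3_port(server_cmd))
```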

nshiveg commented 7 months ago

This is from the client:

[test@localhost ~]$ iperf3 -c 30.10.0.2 -i1 -p 5004 -t 80 -b 1m -l 1300 -V -R -u
iperf 3.5
Linux localhost.localdomain 4.18.0-477.21.1.el8_8.x86_64 #1 SMP Tue Aug 8 21:30:09 UTC 2023 x86_64
Control connection MSS 1404
Time: Thu, 18 Apr 2024 16:24:25 GMT
Connecting to host 30.10.0.2, port 5004
Reverse mode, remote host 30.10.0.2 is sending
Cookie: b2xukwmdwwzqnh7cildedbdxjdd4o4kqzpbg
[ 5] local 30.10.0.0 port 44146 connected to 30.10.0.2 port 5004
iperf3: error - unable to receive control message: Connection reset by peer

This is from the server:

iperf3 -s -i 1 -p 5004

Server listening on 5004

Accepted connection from 10.10.31.106, port 41714
[ 5] local 20.10.20.1 port 5004 connected to 10.10.31.106 port 44146
iperf3: error - unable to receive control message: Connection reset by peer

nshiveg commented 7 months ago

This is with the latest version, 3.16.

Server:

/home/ndacblr/iperf-3.16/src/iperf3 -s -i 1 -p 5004
Server listening on 5004 (test #1)

Accepted connection from 10.10.31.118, port 51760
[ 5] local 20.10.20.1 port 5004 connected to 10.10.31.118 port 43847
iperf3: error - unable to receive control message - port may not be available, the other side may have stopped running, etc.: Connection reset by peer

Server listening on 5004 (test #2)

Accepted connection from 10.10.31.118, port 57128
[ 5] local 20.10.20.1 port 5004 connected to 10.10.31.118 port 53114
iperf3: error - unable to receive control message - port may not be available, the other side may have stopped running, etc.: Connection reset by peer

Server listening on 5004 (test #3)

Client:

[ndacblr@localhost iperf-3.16]$ /home/ndacblr/iperf-3.16/src/iperf3 -c 30.10.0.7 -i1 -p 5004 -t 80 -b 1m -l 1300 -V -R -u
iperf 3.16
Linux localhost.localdomain 4.18.0-477.21.1.el8_8.x86_64 #1 SMP Tue Aug 8 21:30:09 UTC 2023 x86_64
Control connection MSS 1404
Time: Fri, 19 Apr 2024 06:50:52 GMT
Connecting to host 30.10.0.7, port 5004
Reverse mode, remote host 30.10.0.7 is sending
Cookie: eksoo2m2bsvdh7ihfsrbobipzyttlt5eokri
Target Bitrate: 1000000
[ 5] local 30.10.0.0 port 53114 connected to 30.10.0.7 port 5004
Starting Test: protocol: UDP, 1 streams, 1300 byte blocks, omitting 0 seconds, 80 second test, tos 0
iperf3: error - unable to send control message - port may not be available, the other side may have stopped running, etc.: Broken pipe

nshiveg commented 7 months ago

What we have observed is that at first, for up to about a minute, the connection reset occurs. When we try again a few minutes later, it works. This behavior is typically observed with UDP.

davidBar-On commented 7 months ago

> What we have observed is that at first, for up to about a minute, the connection reset occurs. When we try again a few minutes later, it works.

Do you mean a few minutes after starting the server? I.e., after you start the server, does it take a few minutes until you can run a successful test?

Also, does the error happen immediately after the "[ 5] local ..." message is printed, or does some time pass between these two messages?

> This behavior is typically observed with UDP.

By "typically", do you mean that the error sometimes also happens for TCP? Also, it would help if you could try the UDP test without the -R. This may help to understand the cause of the problem.

> /home/ndacblr/iperf-3.16/src/iperf3 -s -i 1 -p 5004
> ....
> [ndacblr@localhost iperf-3.16]$ /home/ndacblr/iperf-3.16/src/iperf3 -c 30.10.0.7 -i1 -p 5004 -t 80 -b 1m -l 1300 -V -R -u

Can you run both server and client using -V -d? The additional information may help to understand where the error happened.

> Accepted connection from 10.10.31.118, port 51760
> [ 5] local 20.10.20.1 port 5004 connected to 10.10.31.118 port 43847
> .....
> Connecting to host 30.10.0.7, port 5004
> Reverse mode, remote host 30.10.0.7 is sending
> Cookie: eksoo2m2bsvdh7ihfsrbobipzyttlt5eokri
> Target Bitrate: 1000000
> [ 5] local 30.10.0.0 port 53114 connected to 30.10.0.7 port 5004

The client and server are showing different IP addresses, e.g. the server shows its address as 20.10.20.1 while the client is connecting to 30.10.0.7. I therefore assume that NAT is performed in between. Can you describe the network between the client and server? In particular, is there a load balancer or proxy between them, e.g. HAProxy?
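
The address comparison described above can be done mechanically on the "[ 5] local ... connected to ..." lines that both sides print. The Python sketch below is a hypothetical helper, not part of iperf3: the names parse_stream_line and address_mismatches are invented for illustration, and the two sample lines are copied from the 3.16 output in this thread. A non-empty result only hints at address translation or a proxy in the path; it does not identify the device.

```python
import re

# Matches the stream line iperf3 prints on both client and server.
STREAM_RE = re.compile(
    r"local (\S+) port (\d+) connected to (\S+) port (\d+)"
)


def parse_stream_line(line):
    """Return ((local_ip, local_port), (remote_ip, remote_port))."""
    match = STREAM_RE.search(line)
    if match is None:
        raise ValueError("no 'local ... connected to ...' line found")
    local_ip, local_port, remote_ip, remote_port = match.groups()
    return (local_ip, int(local_port)), (remote_ip, int(remote_port))


def address_mismatches(client_line, server_line):
    """List the differences between the two views of the same stream."""
    c_local, c_remote = parse_stream_line(client_line)
    s_local, s_remote = parse_stream_line(server_line)
    issues = []
    if c_remote[0] != s_local[0]:
        issues.append(
            f"client connects to {c_remote[0]}, server sees itself as {s_local[0]}"
        )
    if c_local[0] != s_remote[0]:
        issues.append(
            f"client sends from {c_local[0]}, server sees peer {s_remote[0]}"
        )
    return issues


client_line = "[ 5] local 30.10.0.0 port 53114 connected to 30.10.0.7 port 5004"
server_line = "[ 5] local 20.10.20.1 port 5004 connected to 10.10.31.118 port 53114"

for issue in address_mismatches(client_line, server_line):
    print(issue)
```

In this sample the port numbers match on both sides (53114 and 5004) while both IP addresses differ, which is consistent with address translation happening somewhere between the hosts.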