esnet / iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

crazy udp output with reverse mode #1044

Open O-ring opened 4 years ago

O-ring commented 4 years ago

Bug Report

I'm getting this crazy output when I run iperf 3.9 with UDP traffic in reverse mode (-R). The 10.24.209.252 host is running on a 10 Gbit/s NIC, while the 10.68.64.37 host is running on a 1 Gbit/s NIC.

iperf3 -V -c 10.68.64.37 -u -b 800M -t 10 -R
iperf 3.9
Linux FRPARDPADMSVI 2.6.32-754.6.3.el6.x86_64 #1 SMP Tue Sep 18 10:29:08 EDT 2018 x86_64
Control connection MSS 1448
Setting UDP block size to 1448
Time: Mon, 24 Aug 2020 09:54:43 GMT
Connecting to host 10.68.64.37, port 5201
Reverse mode, remote host 10.68.64.37 is sending
      Cookie: sju3p5e5vt3csydm72y6whzxwjewnbnf4al7
      Target Bitrate: 800000000
[  5] local 10.24.209.252 port 36104 connected to 10.68.64.37 port 5201
Starting Test: protocol: UDP, 1 streams, 1448 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  95.4 MBytes   800 Mbits/sec  0.011 ms  2622/71690 (3.7%)  
[  5]   1.00-2.00   sec  94.6 MBytes   793 Mbits/sec  0.013 ms  573/69059 (0.83%)  
[  5]   2.00-3.00   sec  98.0 MBytes   822 Mbits/sec  0.014 ms  -1919/69045 (-2.8%)  
[  5]   3.00-4.00   sec   147 MBytes  1.24 Gbits/sec  0.007 ms  698/69091 (1%)  
[  5]   4.00-5.00   sec   132 MBytes  1.11 Gbits/sec  0.008 ms  -1465/69045 (-2.1%)  
[  5]   5.00-6.00   sec   126 MBytes  1.06 Gbits/sec  0.017 ms  -165/68877 (-0.24%)  
[  5]   6.00-7.00   sec   122 MBytes  1.02 Gbits/sec  0.196 ms  -344/69263 (-0.5%)  
[  5]   7.00-8.00   sec   139 MBytes  1.17 Gbits/sec  0.140 ms  0/68939 (0%)  
[  5]   8.00-9.00   sec   164 MBytes  1.37 Gbits/sec  0.007 ms  0/69173 (0%)  
[  5]   9.00-10.00  sec   167 MBytes  1.40 Gbits/sec  0.026 ms  0/69058 (0%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.07  sec   960 MBytes   800 Mbits/sec  0.000 ms  0/695461 (0%)  sender
[SUM]  0.0-10.1 sec  362892 datagrams received out-of-order
[  5]   0.00-10.00  sec  1.25 GBytes  1.08 Gbits/sec  0.026 ms  0/693240 (0%)  receiver

iperf Done.
iperf3 -V -c 10.68.64.37 -u -b 800M -t 10 -R
iperf 3.9
Linux FRPARDPADMSVI 2.6.32-754.6.3.el6.x86_64 #1 SMP Tue Sep 18 10:29:08 EDT 2018 x86_64
Control connection MSS 1448
Setting UDP block size to 1448
Time: Mon, 24 Aug 2020 09:55:34 GMT
Connecting to host 10.68.64.37, port 5201
Reverse mode, remote host 10.68.64.37 is sending
      Cookie: cz3f4ihzh4zwyussouvlbtyex45nmog42xrv
      Target Bitrate: 800000000
[  5] local 10.24.209.252 port 36804 connected to 10.68.64.37 port 5201
Starting Test: protocol: UDP, 1 streams, 1448 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec   178 MBytes  1.50 Gbits/sec  0.009 ms  0/112027 (0%)  
[  5]   1.00-2.00   sec   189 MBytes  1.59 Gbits/sec  0.018 ms  0/26066 (0%)  
[  5]   2.00-3.00   sec   189 MBytes  1.59 Gbits/sec  0.043 ms  0/69056 (0%)  
[  5]   3.00-4.00   sec   190 MBytes  1.60 Gbits/sec  0.014 ms  0/69054 (0%)  
[  5]   4.00-5.00   sec   189 MBytes  1.59 Gbits/sec  0.014 ms  0/69056 (0%)  
[  5]   5.00-6.00   sec   190 MBytes  1.59 Gbits/sec  0.032 ms  0/69050 (0%)  
[  5]   6.00-7.00   sec   188 MBytes  1.58 Gbits/sec  0.010 ms  0/69086 (0%)  
[  5]   7.00-8.00   sec   188 MBytes  1.58 Gbits/sec  0.012 ms  0/69041 (0%)  
[  5]   8.00-9.00   sec   188 MBytes  1.58 Gbits/sec  0.081 ms  0/68998 (0%)  
[  5]   9.00-10.00  sec   188 MBytes  1.58 Gbits/sec  0.042 ms  0/69117 (0%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.03  sec   957 MBytes   800 Mbits/sec  0.000 ms  0/692797 (0%)  sender
[SUM]  0.0-10.0 sec  787254 datagrams received out-of-order
[  5]   0.00-10.00  sec  1.84 GBytes  1.58 Gbits/sec  0.042 ms  0/690551 (0%)  receiver

iperf Done.

Edit: Reformatted program output for legibility. --bmah.

bmah888 commented 4 years ago

What seems to be happening is that UDP packets are being received out of order. You can see that in the summary statistics. iperf3 was designed to handle this, but there seems to be some kind of pathological condition that causes negative counts. We attempt to count packet losses by finding gaps in sequence numbers. When misordered packets later show up to fill those gaps, we try to compensate by decrementing the loss count, and something is happening there that we don't handle correctly.
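As an illustration of the gap-counting scheme described above, here is a minimal sketch in Python. It is a toy model under stated assumptions, not the iperf3 source: the function name `track_losses` and the use of 1-based sequence numbers are hypothetical choices made for the example.

```python
def track_losses(arrivals):
    """Return the running loss estimate after each datagram arrives.

    Toy model only: sequence numbers are assumed to start at 1.
    """
    highest_seq = 0  # highest sequence number seen so far
    lost = 0         # running loss estimate
    history = []
    for seq in arrivals:
        if seq > highest_seq:
            # Every skipped sequence number is provisionally counted as lost.
            lost += seq - highest_seq - 1
            highest_seq = seq
        else:
            # A late datagram fills an earlier gap, so take one loss back.
            lost -= 1
        history.append(lost)
    return history


# Datagrams 3 and 4 arrive after 5: the estimate rises to 2, then falls back.
print(track_losses([1, 2, 5, 3, 4, 6]))  # [0, 0, 2, 1, 0, 0]
```

In this model the estimate is only correct once the late datagrams have arrived, so a report taken between the gap and the late arrivals sees a temporary overcount.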

I think there's already another issue open for this type of problem. If there is, I might close this one and point to the existing one instead.

bmah888 commented 4 years ago

#914 was the issue I was thinking of. I need to ponder how closely these two are related.

bmah888 commented 4 years ago

Er, actually, another possible explanation came up in the discussion for #914: UDP packets (whether in order or misordered) can be sent in one measurement interval and received in a subsequent interval. In the first program output provided, the per-interval "lost" values add up to 0, which matches the total number of packets lost in the summary.

If this is the case, then iperf3 is working, albeit in a somewhat clunky fashion.

O-ring commented 4 years ago

Hello Bruce,

Thanks for the explanation. Indeed, the following sum equals zero:

2622+573-1919+698-1465-165-344 = 0
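The same check can be done mechanically. The short Python snippet below sums the "Lost" and "Total" columns copied by hand from the first run's per-interval rows and compares them with the receiver summary line; the variable names are only illustrative.

```python
# "Lost" and "Total" columns copied from the ten interval rows of the first run.
lost_per_interval = [2622, 573, -1919, 698, -1465, -165, -344, 0, 0, 0]
total_per_interval = [71690, 69059, 69045, 69091, 69045,
                      68877, 69263, 68939, 69173, 69058]

total_lost = sum(lost_per_interval)
total_datagrams = sum(total_per_interval)

# The receiver summary line reports 0/693240.
print(total_lost, total_datagrams)  # 0 693240
```

Both sums agree with the receiver summary, which is consistent with losses being counted in one interval and cancelled in a later one.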

One question about the wrong receiver speed: any idea why iperf is reporting a bogus receiver value? The sender is firing UDP datagrams at 800 Mbit/s, while the receiver is reporting more than 1 Gbit/s. Thanks again for the feedback.

davidBar-On commented 3 years ago

@O-ring, it seems that the UDP packets are somehow being duplicated in your system. Note that the packet count is about 69K per second, but in both cases a high number of packets were received out of order (about 36K packets/sec in the first case and about 79K packets/sec in the second case, going by the summary totals). The extra bandwidth seems to come from these out-of-order packets.
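A rough back-of-the-envelope check of this, using one interval row from the second run (7.00-8.00 sec: 188 MBytes, 69041 datagrams). It assumes that "MBytes" in the output means 2**20 bytes, which is consistent with the 1.58 Gbits/sec shown on the same row, and that every datagram carries the 1448-byte block size printed in the output.

```python
BLOCK_SIZE = 1448          # bytes, from "Setting UDP block size to 1448"
MIB = 2 ** 20              # assumption: "MBytes" in the output are 2**20 bytes

transfer_mbytes = 188      # "Transfer" column, 7.00-8.00 sec row, second run
counted_datagrams = 69041  # "Total" column of the same row

# How many 1448-byte datagrams would it take to carry that many bytes?
implied_datagrams = transfer_mbytes * MIB / BLOCK_SIZE
ratio = implied_datagrams / counted_datagrams
bitrate_gbits = transfer_mbytes * MIB * 8 / 1e9

print(round(implied_datagrams), round(ratio, 2), round(bitrate_gbits, 2))
# 136141 1.97 1.58
```

Under those assumptions, roughly twice as many datagrams arrived in that second as the packet counter reports, which fits the duplication hypothesis.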

Byte and packet counting are both done in iperf_udp_recv(). Every byte received is counted, but the number of packets received is determined by the highest packet number received. Therefore, if UDP packets are received more than once (e.g. because they are duplicated by the system), the byte count is inflated but the packet count is not. The extra (duplicated) packets are counted as out-of-order packets.
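The effect can be reproduced with a small Python model of the accounting described above. This is a simplified sketch, not the actual iperf_udp_recv() code: the function name `receive`, the `(seq, size)` tuple layout, and the 1-based sequence numbers are assumptions made for the example.

```python
def receive(datagrams):
    """Count bytes, packets, and out-of-order arrivals for (seq, size) datagrams."""
    bytes_received = 0
    packet_count = 0  # tracks the highest sequence number seen
    out_of_order = 0
    for seq, size in datagrams:
        bytes_received += size      # every datagram adds to the byte total
        if seq > packet_count:
            packet_count = seq      # only a new highest number moves the count
        else:
            out_of_order += 1       # a duplicate ends up here
    return bytes_received, packet_count, out_of_order


sent = [(seq, 1448) for seq in range(1, 6)]
duplicated = [d for d in sent for _ in range(2)]  # each datagram delivered twice

print(receive(sent))        # (7240, 5, 0)
print(receive(duplicated))  # (14480, 5, 5)
```

With every datagram delivered twice, the byte total doubles while the packet count stays the same, and each extra copy is reported as out of order.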