esnet / iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

bandwidth limit overshoot after micro outages #1747

Open jgc234 opened 3 weeks ago

jgc234 commented 3 weeks ago

Context

Bug Report

The --bitrate option is misleading, or not documented well, depending on how you want to classify it. Instead of being a maximum target bitrate, it is a long-term averaging target - it can overdrive without limit until the long-term average has settled. The comment on the optional burst rate ("can temporarily exceed the specified bandwidth limit") implies that the non-burst version does not temporarily exceed the intended bandwidth.

Actual Behavior

If you have small outages on a network (eg 10 seconds), the bitrate throttle will attempt to catch up on the lost traffic by behaving as if no throttle limit exists, driving the traffic as fast as it possibly can until the average bitrate since the start of the test matches the long-term target bitrate. This seems to make sense looking at iperf_check_throttle, which calculates the average since the start time.
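For reference, the check boils down to comparing the whole-test average against the target. A simplified sketch of that logic (this paraphrases the behaviour rather than quoting the actual iperf_check_throttle source; the struct and field names here are illustrative only):

/*
 * Simplified paraphrase of the throttle decision described above (not the
 * actual iperf3 source): the sender gets a "green light" whenever the
 * average bitrate measured since the very start of the test is below the
 * target set with -b/--bitrate.
 */
#include <stdint.h>

struct throttle_state {
    double   start_time;   /* test start time, seconds */
    uint64_t bytes_sent;   /* total bytes sent since the start of the test */
    uint64_t target_bps;   /* -b / --bitrate target, bits per second */
};

/*
 * After an outage, bytes_sent lags far behind target_bps * elapsed, so this
 * keeps returning 1 until the long-term average catches up -- which is the
 * unbounded overshoot seen in the output below.
 */
int throttle_green_light(const struct throttle_state *st, double now)
{
    double elapsed = now - st->start_time;
    if (elapsed <= 0.0)
        return 1;
    double avg_bps = (double)st->bytes_sent * 8.0 / elapsed;
    return avg_bps < (double)st->target_bps;
}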

This doesn't look too exciting on a LAN or high-speed network (maybe a second or so at maximum), but on a slower WAN it may saturate the link for many minutes trying to make up for the lost data.

On a LAN, the overshoot looks like a quantisation error - just filling up the congestion window for a short blip.

Unfortunately I only have an example on a LAN for the moment. I can generate a WAN-looking example if required.

❯ iperf3-darwin -c beep --bitrate 20M --time 500
Connecting to host beep, port 5201
[  7] local 2403:5801:xxx:x:xxxx:xxxx:xxxx:xxxx port 63316 connected to 2403:5801:xxx:x::x port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  7]   0.00-1.00   sec  2.50 MBytes  21.0 Mbits/sec    2   1.80 MBytes   4ms     
[  7]   1.00-2.00   sec  2.38 MBytes  19.9 Mbits/sec    3   3.94 MBytes   6ms     
[  7]   2.00-3.00   sec  2.38 MBytes  19.9 Mbits/sec    2   5.90 MBytes   10ms     
[  7]   3.00-4.00   sec  2.38 MBytes  19.9 Mbits/sec    2   7.90 MBytes   8ms     
[  7]   4.00-5.00   sec  2.38 MBytes  19.9 Mbits/sec    1   8.00 MBytes   17ms     
[  7]   5.00-6.00   sec  2.38 MBytes  19.9 Mbits/sec    1   8.00 MBytes   8ms     
[  7]   6.00-7.00   sec   776 KBytes  6.36 Mbits/sec    4   1.16 KBytes   10ms     <--network outage start
[  7]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.16 KBytes   10ms
[  7]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   10ms
[  7]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   10ms
[  7]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]  14.00-15.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   10ms
[  7]  15.00-16.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]  16.00-17.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   10ms
[  7]  17.00-18.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   10ms
[  7]  18.00-19.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   10ms
[  7]  19.00-20.00  sec  8.37 MBytes  70.2 Mbits/sec   26   1.04 MBytes   20ms     <-- network recovered
[  7]  20.00-21.00  sec  26.7 MBytes   224 Mbits/sec    0   1.08 MBytes   4ms      <-- unbounded overshoot
[  7]  21.00-22.00  sec  2.38 MBytes  19.9 Mbits/sec    0   1.08 MBytes   7ms     <-- settle back to average
[  7]  22.00-23.00  sec  2.38 MBytes  19.9 Mbits/sec    1   1.08 MBytes   8ms     
[  7]  23.00-24.00  sec  2.38 MBytes  19.9 Mbits/sec    3   1.08 MBytes   10ms     
[  7]  24.00-25.00  sec  2.38 MBytes  19.9 Mbits/sec    2   1.09 MBytes   20ms     
[  7]  25.00-26.00  sec  2.38 MBytes  19.9 Mbits/sec    2   1.13 MBytes   9ms     
[  7]  26.00-27.00  sec  2.38 MBytes  19.9 Mbits/sec    3   1.20 MBytes   10ms     
[  7]  27.00-28.00  sec  2.38 MBytes  19.9 Mbits/sec    2   1.28 MBytes   9ms     
[  7]  28.00-29.00  sec  2.38 MBytes  19.9 Mbits/sec    1   1.38 MBytes   7ms     
^C[  7]  29.00-29.28  sec   640 KBytes  18.9 Mbits/sec    0   1.41 MBytes   12ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  7]   0.00-29.28  sec  69.8 MBytes  20.0 Mbits/sec   62             sender   <-- correct long-term average.
[  7]   0.00-29.28  sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: interrupt - the client has terminated

Steps to Reproduce

Possible Solution

jgc234 commented 3 weeks ago

Here's a more degenerate example: a 5Mb/s iperf throttle over a network capable of 20Mb/s, with a 16-second outage in the middle of the test, which causes iperf to saturate the network for another 10 seconds afterwards.


iperf3-darwin -c 2403:5801:xxx:x::x --bitrate 5M --time 500
Connecting to host 2403:5801:xxx:x::x, port 5201
[  5] local 2403:5801:xxx:xx:xxxx:xxxx:xxxx:xxxx port 63534 connected to 2403:5801:xxx:x::x port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  5]   0.00-1.00   sec   646 KBytes  5.29 Mbits/sec  121   8.37 KBytes   3ms     
[  5]   1.00-2.00   sec   640 KBytes  5.24 Mbits/sec   50   12.6 KBytes   2ms     
[  5]   2.00-3.00   sec   640 KBytes  5.24 Mbits/sec   59   11.2 KBytes   4ms     
[  5]   3.00-4.00   sec   638 KBytes  5.22 Mbits/sec   41   13.9 KBytes   5ms     
[  5]   4.00-5.00   sec   507 KBytes  4.15 Mbits/sec   49   12.6 KBytes   3ms     
[  5]   5.00-6.00   sec   256 KBytes  2.10 Mbits/sec   13   1.16 KBytes   3ms     <-- start of outage
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    2   1.16 KBytes   3ms
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  14.00-15.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]  15.00-16.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  16.00-17.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]  17.00-18.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  18.00-19.00  sec  0.00 Bytes  0.00 bits/sec    1   1.39 KBytes   3ms
[  5]  19.00-20.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  20.00-21.00  sec  0.00 Bytes  0.00 bits/sec    0   1.39 KBytes   3ms
[  5]  21.00-22.00  sec   696 KBytes  5.70 Mbits/sec   53   11.2 KBytes   5ms     <-- network recovered 
[  5]  22.00-23.00  sec  2.09 MBytes  17.5 Mbits/sec  157   11.2 KBytes   4ms     <-- overshoot to saturation of network
[  5]  23.00-24.00  sec  2.11 MBytes  17.7 Mbits/sec  173   9.76 KBytes   7ms     
[  5]  24.00-25.00  sec  2.12 MBytes  17.8 Mbits/sec  178   11.2 KBytes   3ms     
[  5]  25.00-26.00  sec  1.61 MBytes  13.5 Mbits/sec  119   9.76 KBytes   51ms     
[  5]  26.00-27.00  sec  1.93 MBytes  16.2 Mbits/sec  148   8.37 KBytes   5ms     
[  5]  27.00-28.00  sec  1.65 MBytes  13.8 Mbits/sec  123   13.9 KBytes   3ms     
[  5]  28.00-29.00  sec  1.79 MBytes  15.0 Mbits/sec  141   9.76 KBytes   6ms     
[  5]  29.00-30.00  sec   749 KBytes  6.14 Mbits/sec   65   13.9 KBytes   3ms     <-- recovered the "average", fallback to throttle
[  5]  30.00-31.00  sec   640 KBytes  5.24 Mbits/sec   39   6.97 KBytes   4ms     
[  5]  31.00-32.00  sec   512 KBytes  4.19 Mbits/sec   45   2.79 KBytes   3ms     
[  5]  32.00-33.00  sec   640 KBytes  5.24 Mbits/sec   40   9.76 KBytes   4ms     
[  5]  33.00-34.00  sec   640 KBytes  5.24 Mbits/sec   39   13.9 KBytes   6ms     
[  5]  34.00-35.00  sec   640 KBytes  5.24 Mbits/sec   49   8.37 KBytes   4ms     
[  5]  35.00-36.00  sec   638 KBytes  5.22 Mbits/sec   39   12.6 KBytes   5ms     
[  5]  36.00-37.00  sec   512 KBytes  4.19 Mbits/sec   54   12.6 KBytes   3ms     
[  5]  37.00-38.00  sec   640 KBytes  5.24 Mbits/sec   39   2.79 KBytes   4ms     
^C[  5]  38.00-38.72  sec   512 KBytes  5.81 Mbits/sec   43   11.2 KBytes   4ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-38.72  sec  23.2 MBytes  5.03 Mbits/sec  1886             sender
[  5]   0.00-38.72  sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: interrupt - the client has terminated
davidBar-On commented 3 weeks ago

Or use a better algorithm that does some form of closed-loop adaptive rate limiting.

Seems to be a real issue when Cellular/RF networks are used (e.g. a car going into a tunnel for several seconds). I tried to think about what such an algorithm might be and came up with the following options:

The question is which of the above (or other) options is better for the use case?

jgc234 commented 2 weeks ago

The question is which of the above (or other) options is better for the use case?

I think I'm overthinking this, but also consider the following:

I had a quick look at common algorithms - most of these are designed for the two control systems in the other order (traffic generator first, flowing into some type of network throttle afterwards, managing a queue and controlling the output of the queue - eg token bucket, leaky bucket). The algorithm for the exit point of the queue is the part we're interested in, which seems to boil down to some controlled release over a time quantum - which is the set of options you've got above anyway. My complex and slow way of getting there.

Another thought - what about quantisation? Do we slam the link at 100% until our 1-second amount has been completed, then stop dead until the next second, or something more fine-grained and smoother pacing, or doesn't it matter?

A funny observation while reading up on shaping algorithms - there's a handy tool available called iperf to generate traffic to test your algorithm :)

davidBar-On commented 2 weeks ago

Another thought - what about quantisation? Do we slam the link at 100% until our 1-second amount has been completed, then stop dead until the next second, or something more fine-grained and smoother pacing, or doesn't it matter?

"quantisation" is already how uiperf3 works, using the --pacing-timer value (default is 1000 micro-sec).

I had a quick look at common algorithms .... funny observation while reading up on shaping algorithms ...

It is a good idea to look at these algorithms - I hadn't done so before. Reading about the shaping algorithms, they seem to be too complex for iperf3. In addition, your funny observation (which I agree is funny) leads me to believe that there is actually no need for such complex algorithms in iperf3, as it is the tool used to load the network when testing such algorithms.

What I think may be good enough and easy enough to be implemented in iperf3 is:

  1. Have an option to limit the maximum bitrate sent. Actually, this option is already supported under Linux for TCP, using --fq-rate, so it will have to be implemented manually for the other cases - basically take it as the maximum rate whenever the temporary rate required is over the -b value.
  2. Have an option to limit the average bitrate calculation to the last n report intervals. E.g. "1" means the calculation is done for each interval independently of the previous intervals, "2" takes into account only the current and previous intervals, etc. (see the sketch below).
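As a rough illustration of option 2, a minimal sketch assuming a fixed ring buffer of per-interval byte counts (all names here are made up for the example; nothing is taken from the iperf3 source):

#include <stdint.h>

#define MAX_WINDOW 16

struct window_throttle {
    uint64_t bytes[MAX_WINDOW]; /* bytes sent in each of the last n intervals */
    int      n;                 /* window size in report intervals, 1..MAX_WINDOW */
    int      cur;               /* index of the current interval */
    double   interval_secs;     /* report interval length, e.g. 1.0 */
    uint64_t target_bps;        /* -b target, bits per second */
};

void window_init(struct window_throttle *w, int n, double interval_secs,
                 uint64_t target_bps)
{
    *w = (struct window_throttle){ .n = n, .interval_secs = interval_secs,
                                   .target_bps = target_bps };
}

/* Call at each report-interval boundary: the oldest interval drops out of the
 * window, so traffic "owed" from an outage is forgotten after n intervals
 * instead of accumulating for the whole test. */
void window_new_interval(struct window_throttle *w)
{
    w->cur = (w->cur + 1) % w->n;
    w->bytes[w->cur] = 0;
}

void window_account(struct window_throttle *w, uint64_t sent)
{
    w->bytes[w->cur] += sent;
}

/* Green light if the average over the last n intervals is below the target. */
int window_green_light(const struct window_throttle *w)
{
    uint64_t total = 0;
    for (int i = 0; i < w->n; i++)
        total += w->bytes[i];
    double avg_bps = (double)total * 8.0 / ((double)w->n * w->interval_secs);
    return avg_bps < (double)w->target_bps;
}

With n == 1 this throttles each report interval independently; larger n gives a moving average that allows a bounded amount of catch-up.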
jgc234 commented 2 weeks ago

Option 2 sounds OK. Pretty simple, and it's similar to what's already there, but with a limited view into history rather than all the way back to the start. This would mean you could still get small bursts, but at least they're limited to a fraction of a second.

Option 2 with a smaller time quantum becomes an implementation of Option 1. If increments of the pacing timer were used instead of the reporting timer, the user would have full control, but it would also be more confusing for a user to think about and calculate.

Where the average is calculated over multiple intervals, it becomes a moving average which will have a smoothing effect.

A broader question - Is the current -b behaviour an expectation for users that is actively exploited as a feature, or is it considered a bug, or are historical behaviours left as they are to minimise change?

jgc234 commented 2 weeks ago

A more disruptive thought - the burst function looks like a token bucket algorithm already, but in the code it looks to be implemented separately from the throttle (at a quick glance). In theory they could be unified into one simple algorithm that does both - allocate tokens at a constant rate based on the target throughput rate, and collect up to a "burst" number of tokens in reserve. Less code, less logic, unified concept, same interface. Disclaimer - I'm talking more than reading code.
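A minimal sketch of that unified idea, purely illustrative (the names and structure are invented for the example, not taken from the existing burst/throttle code):

#include <stdint.h>

/* Classic token bucket: tokens (bytes of send credit) accrue at the target
 * rate and are capped at "burst" bytes, so catch-up after an outage is
 * bounded by the burst size rather than by the length of the outage. */
struct token_bucket {
    double tokens;       /* bytes of credit currently available */
    double rate_Bps;     /* refill rate, bytes per second (target bitrate / 8) */
    double burst_bytes;  /* bucket capacity: maximum accumulated credit */
    double last_refill;  /* time of the last refill, seconds */
};

/* Add credit for the elapsed time, clamped to the bucket capacity. */
void tb_refill(struct token_bucket *tb, double now)
{
    tb->tokens += (now - tb->last_refill) * tb->rate_Bps;
    if (tb->tokens > tb->burst_bytes)
        tb->tokens = tb->burst_bytes;
    tb->last_refill = now;
}

/* Returns 1 and consumes credit if a write of `len` bytes may go out now. */
int tb_try_send(struct token_bucket *tb, double now, uint64_t len)
{
    tb_refill(tb, now);
    if (tb->tokens < (double)len)
        return 0;   /* red light: wait for the next pacing tick */
    tb->tokens -= (double)len;
    return 1;
}

Setting burst_bytes to roughly one write's worth behaves like a plain rate limit; larger values give the documented burst behaviour, with overshoot after an outage capped at the burst size.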

bmah888 commented 2 weeks ago

This situation is somewhat unusual in the environments for which iperf3 was originally intended (high-speed R&E networks, which tend to have both high bandwidth and high reliability). It's definitely counterintuitive. Basically, iperf3 doesn't really know when the network is unavailable or when it's in "catch-up" mode with respect to its software pacing.

If you really want to cap the sending rate of the connection so that the sender never, under any circumstances, exceeds some bitrate you specify, then (at least under Linux) you can try using the --fq-rate parameter. This enables fair-queueing based pacing within the kernel for the connection, and it's applied (as far as I know) to all data sent on the connection, whether it's original data or retransmitted data. This essentially puts a bottleneck on the path.
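For reference, a minimal sketch of the kernel-level cap this relies on, assuming Linux and the SO_MAX_PACING_RATE socket option (as I understand it, roughly the mechanism --fq-rate uses); error handling kept minimal:

#include <stdio.h>
#include <sys/socket.h>

/* Cap the socket's send rate in the kernel (fq qdisc, or TCP internal pacing
 * on newer kernels). The rate is in bytes per second, so a 20 Mbit/s cap
 * would be 20000000 / 8. Applies to retransmissions as well as new data. */
int set_max_pacing_rate(int sockfd, unsigned int bytes_per_sec)
{
    if (setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE,
                   &bytes_per_sec, sizeof(bytes_per_sec)) < 0) {
        perror("setsockopt(SO_MAX_PACING_RATE)");
        return -1;
    }
    return 0;
}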

EDIT: I'm going to advise against trying to add better pacing mechanisms within iperf3. Really, the main use case for iperf3 is to check end-to-end network and application performance on high-speed R&E networks. In this type of scenario the application-level pacing isn't very useful, and even small code changes can affect the ability of iperf3 to test on high-bandwidth paths (100+ Gbps).