I assume that this is with v0.90, right? Please try with v0.95 instead.
I just tried v0.95; the results are at the end. (The network conditions changed a little, so please don't compare the results from the first post with these.)
The throughput of MPTCP seems fine when cubic is used. However, I wonder whether it is normal for lia to have lower throughput, and I want to ask a high-level question: how should one choose between these congestion controls, especially in a DCN scenario? As far as I understand from the paper, lia preserves fairness, while cubic behaves almost like opening multiple independent sockets, which should still be fine if all end hosts (in a DCN scenario) are using MPTCP + cubic. Am I right?
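For reference, a rough sketch of how the congestion control can be switched on the mptcp.org kernel; the exact module names available depend on what was built into your kernel, so treat these as assumptions:

# list the congestion controls the running kernel actually offers
sysctl net.ipv4.tcp_available_congestion_control
# coupled control (lia): keeps the MPTCP connection fair to single-path TCP at shared bottlenecks
sysctl -w net.ipv4.tcp_congestion_control=lia
# uncoupled cubic: each subflow behaves roughly like an independent cubic flow
sysctl -w net.ipv4.tcp_congestion_control=cubic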
I also ran experiments measuring throughput versus time. It turns out MPTCP + lia needs a very long time (> 4 s, even with RTT < 200 us) to climb to its maximum throughput, and that maximum is far below the link bandwidth (15 Gbps vs. 25 Gbps). With MPTCP + reno, it reaches maximum throughput within milliseconds, but still falls well short of the link bandwidth (18 Gbps vs. 25 Gbps). As a baseline, plain Linux TCP reaches 23 Gbps within milliseconds.
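(For context, a sketch of how per-second numbers like the ones below can be collected; the exact flags here are my assumption, not a verbatim copy of the command used:

iperf3 -s                          # on the receiving host
iperf3 -c <server-ip> -t 15 -i 1   # on the sender: 15-second run, report every 1 s; Retr/Cwnd are shown on the client side)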
With MPTCP disabled:
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 2.69 GBytes 23.1 Gbits/sec 310 1.04 MBytes
[ 4] 1.00-2.00 sec 2.73 GBytes 23.4 Gbits/sec 71 1.12 MBytes
[ 4] 2.00-3.00 sec 2.72 GBytes 23.3 Gbits/sec 188 1.13 MBytes
[ 4] 3.00-4.00 sec 2.72 GBytes 23.3 Gbits/sec 202 1.11 MBytes
[ 4] 4.00-5.00 sec 2.72 GBytes 23.3 Gbits/sec 103 1.10 MBytes
[ 4] 5.00-6.00 sec 2.71 GBytes 23.3 Gbits/sec 245 1.15 MBytes
[ 4] 6.00-7.00 sec 2.71 GBytes 23.3 Gbits/sec 91 1.11 MBytes
[ 4] 7.00-8.00 sec 2.71 GBytes 23.2 Gbits/sec 242 1.09 MBytes
[ 4] 8.00-9.00 sec 2.72 GBytes 23.4 Gbits/sec 168 1.10 MBytes
[ 4] 9.00-10.00 sec 2.73 GBytes 23.4 Gbits/sec 62 1.15 MBytes
[ 4] 10.00-11.00 sec 2.72 GBytes 23.3 Gbits/sec 97 861 KBytes
With MPTCP enabled, congestion control lia:
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 532 MBytes 4.46 Gbits/sec 0 14.1 KBytes
[ 4] 1.00-2.00 sec 1.40 GBytes 12.0 Gbits/sec 0 14.1 KBytes
[ 4] 2.00-3.00 sec 1.77 GBytes 15.2 Gbits/sec 0 14.1 KBytes
[ 4] 3.00-4.00 sec 1.50 GBytes 12.9 Gbits/sec 0 14.1 KBytes
[ 4] 4.00-5.00 sec 1.55 GBytes 13.3 Gbits/sec 0 14.1 KBytes
[ 4] 5.00-6.00 sec 1.86 GBytes 16.0 Gbits/sec 0 14.1 KBytes
[ 4] 6.00-7.00 sec 1.55 GBytes 13.3 Gbits/sec 0 14.1 KBytes
[ 4] 7.00-8.00 sec 1.35 GBytes 11.6 Gbits/sec 0 14.1 KBytes
[ 4] 8.00-9.00 sec 1.45 GBytes 12.5 Gbits/sec 0 14.1 KBytes
[ 4] 9.00-10.00 sec 1.81 GBytes 15.6 Gbits/sec 0 14.1 KBytes
[ 4] 10.00-11.00 sec 1.49 GBytes 12.8 Gbits/sec 0 14.1 KBytes
[ 4] 11.00-12.00 sec 1.30 GBytes 11.2 Gbits/sec 0 14.1 KBytes
[ 4] 12.00-13.00 sec 1.21 GBytes 10.4 Gbits/sec 0 14.1 KBytes
[ 4] 13.00-14.00 sec 1.34 GBytes 11.5 Gbits/sec 0 14.1 KBytes
[ 4] 14.00-15.00 sec 1.72 GBytes 14.8 Gbits/sec 0 14.1 KBytes
With MPTCP enabled, congestion control cubic:
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 2.63 GBytes 22.6 Gbits/sec 0 14.1 KBytes
[ 4] 1.00-2.00 sec 2.64 GBytes 22.7 Gbits/sec 0 14.1 KBytes
[ 4] 2.00-3.00 sec 2.65 GBytes 22.8 Gbits/sec 0 14.1 KBytes
[ 4] 3.00-4.00 sec 2.63 GBytes 22.6 Gbits/sec 0 14.1 KBytes
[ 4] 4.00-5.00 sec 2.66 GBytes 22.9 Gbits/sec 0 14.1 KBytes
[ 4] 5.00-6.00 sec 2.65 GBytes 22.8 Gbits/sec 0 14.1 KBytes
[ 4] 6.00-7.00 sec 2.67 GBytes 22.9 Gbits/sec 0 14.1 KBytes
[ 4] 7.00-8.00 sec 2.65 GBytes 22.8 Gbits/sec 0 14.1 KBytes
[ 4] 8.00-9.00 sec 2.66 GBytes 22.9 Gbits/sec 0 14.1 KBytes
[ 4] 9.00-10.00 sec 2.68 GBytes 23.1 Gbits/sec 0 14.1 KBytes
[ 4] 10.00-11.00 sec 2.64 GBytes 22.7 Gbits/sec 0 14.1 KBytes
[ 4] 11.00-12.00 sec 2.68 GBytes 23.0 Gbits/sec 0 14.1 KBytes
Sorry, I accidentally closed the issue.
After applying core affinity settings in a way similar to the following link, I can reach 25 Gbps with a single connection, with an arbitrary number of subflows and any congestion control. Thank you! The key is to pin all IRQs and RFS for the NICs in use to cores on the same CPU socket, in order to avoid cross-socket cache misses.
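For anyone hitting the same problem, a minimal sketch of this kind of pinning; eth0, the core mask, and the table sizes are placeholders to adapt to your own NIC and NUMA layout:

# pin every IRQ of the NIC to cores on the NUMA node the NIC is attached to (mask 0xff = cores 0-7 here)
for irq in $(grep eth0 /proc/interrupts | awk -F: '{print $1}'); do
    echo ff > /proc/irq/$irq/smp_affinity
done
# enable RFS so received flows are steered to the CPU running the consuming application
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
for q in /sys/class/net/eth0/queues/rx-*; do
    echo 4096 > $q/rps_flow_cnt
done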
Oh, you have several NUMA nodes? Yeah, if the traffic lands on a different NUMA node, performance will go down the drain...
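A quick way to confirm where a NIC sits (eth0 again as a placeholder):

cat /sys/class/net/eth0/device/numa_node   # NUMA node the NIC is attached to (-1 if the platform doesn't report it)
numactl --hardware                         # which cores belong to each node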
I am using MPTCP v0.90. If this issue is expected to disappear with a newer release, please close this one. I am currently installing v0.95 and will report back if the problem below persists.
I have noticed several ways to increase MPTCP performance from other threads, such as disabling checksums and trying the cubic congestion control. However, the throughput of MPTCP with 1 subflow is still substantially lower than plain Linux TCP (see results below). If the iperf output is trustworthy, it seems that with MPTCP enabled, the cwnd cannot grow beyond 14 KB and there are no retransmissions (i.e., no network congestion). Is MPTCP using different send/receive buffer sizes that limit the maximum congestion window? Or does anyone know another possible reason?
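If buffer limits are the suspect, a minimal check along these lines might help rule them out; the values below are illustrative only, and MPTCP shares these TCP sysctls:

sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem   # min/default/max of the autotuned socket buffers
sysctl net.core.rmem_max net.core.wmem_max   # hard caps for explicitly set buffers
# raising the max values allows the window to grow further if buffers are indeed the limit
sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"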
With MPTCP disabled:
net.ipv4.tcp_congestion_control=cubic

With MPTCP enabled:
net.mptcp.mptcp_checksum=0
net.mptcp.mptcp_syn_retries=3
net.ipv4.tcp_congestion_control=cubic
net.mptcp.mptcp_path_manager=ndiffports
net.mptcp.mptcp_scheduler=default
echo 1 > /sys/module/mptcp_ndiffports/parameters/num_subflows
Result: