matttbe opened this issue 8 months ago
I did some tests on my side. I had to use the non-debug mode (otherwise the tolerance is already increased), without KVM.
It looks like the patch from our tree (selftests: mptcp: simult flows: re-adapt BW) helps. When I run the test 100 times:
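(As an aside, a rough sketch of how such a 100-run check can be repeated; the path and invocation are assumptions about the usual kernel selftest layout, not the exact setup used for the numbers above:)

```sh
# From a kernel tree with the MPTCP selftests built, run as root:
cd tools/testing/selftests/net/mptcp
fail=0
for i in $(seq 1 100); do
	./simult_flows.sh >/dev/null 2>&1 || fail=$((fail + 1))
done
echo "failures: ${fail}/100"
```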
I tried to get more info in case of issues (without the patch). When there is an issue:
unbalanced bwidth - reverse direction  transfer slower than expected! runtime 12033 ms, expected 11921 ms  max 11921 [ fail ]

MPTcpExtDuplicateData on the client side is 8, and MPTcpExtMPTCPRetrans is at 8 on the server side.
If it can be normal to have these counters > 0, maybe we can increase the max value? A technical issue in this test is that we give the timeout to mptcp_connect (the binary).
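For context, a minimal sketch of how the expected time and the limit can be derived from the transfer size and the configured bandwidth; variable names, values and the formula are illustrative assumptions, not the actual simult_flows.sh code:

```sh
size=$((16 * 1024 * 1024))            # bytes to transfer
rate_kbps=11000                       # aggregate bandwidth of both links (kbit/s)
time_ms=$((size * 8 / rate_kbps))     # ideal transfer time in ms
tolerance=5                           # extra slack, in percent
max_time_ms=$((time_ms + time_ms * tolerance / 100))

# Per the note above, this limit ends up being handed to the transfer binary
# itself, so only raising the comparison threshold afterwards is not enough.
echo "expected ${time_ms} ms, max ${max_time_ms} ms"
```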
If not (i.e. if these counters should stay at 0), here are the packet captures:
Here are the MIB counters (a sketch of how to collect such per-namespace snapshots follows the two tables):
Counters | OK | Err | Diff |
---|---|---|---|
IpInReceives | 1585 | 867 | -718 |
IpInDelivers | 1585 | 867 | -718 |
IpOutRequests | 1654 | 933 | -721 |
IpOutTransmits | 1654 | 933 | -721 |
TcpActiveOpens | 2 | 2 | 0 |
TcpInSegs | 1585 | 867 | -718 |
TcpOutSegs | 12151 | 12027 | -124 |
Ip6OutRequests | 1 | 1 | 0 |
Ip6OutMcastPkts | 1 | 1 | 0 |
Ip6OutOctets | 56 | 56 | 0 |
Ip6OutMcastOctets | 56 | 56 | 0 |
Ip6OutTransmits | 1 | 1 | 0 |
Icmp6OutMsgs | 1 | 1 | 0 |
Icmp6OutRouterSolicits | 1 | 1 | 0 |
Icmp6OutType133 | 1 | 1 | 0 |
TcpExtTW | 2 | 2 | 0 |
TcpExtDelayedACKLost | 1 | 0 | -1 |
TcpExtTCPPureAcks | 1556 | 838 | -718 |
TcpExtTCPLossProbes | 2 | 2 | 0 |
TcpExtTCPDSACKOldSent | 1 | 0 | -1 |
TcpExtTCPSpuriousRtxHostQueues | 1 | 1 | 0 |
TcpExtTCPOrigDataSent | 12128 | 12004 | -124 |
TcpExtTCPHystartTrainDetect | 1 | 0 | -1 |
TcpExtTCPHystartTrainCwnd | 171 | 0 | -171 |
TcpExtTCPHystartDelayDetect | 1 | 2 | 1 |
TcpExtTCPHystartDelayCwnd | 222 | 724 | 502 |
TcpExtTCPDelivered | 12130 | 12006 | -124 |
IpExtInOctets | 198548 | 137920 | -60628 |
IpExtOutOctets | 16909204 | 16947976 | 38772 |
IpExtInNoECTPkts | 1634 | 906 | -728 |
MPTcpExtMPCapableSYNTX | 1 | 1 | 0 |
MPTcpExtMPCapableSYNACKRX | 1 | 1 | 0 |
MPTcpExtMPTCPRetrans | 0 | 8 | 8 |
MPTcpExtMPJoinSynAckRx | 1 | 1 | 0 |
MPTcpExtOFOQueueTail | 9 | 8 | -1 |
MPTcpExtOFOQueue | 11 | 9 | -2 |
MPTcpExtOFOMerge | 8 | 7 | -1 |
MPTcpExtDuplicateData | 1 | 0 | -1 |
MPTcpExtSndWndShared | 1 | 0 | -1 |
MPTcpExtRcvWndShared | 11 | 7 | -4 |
Time (ms) (limit : 11921) | 11344 | 12033 | 689 |
Counters | Success | Error | Diff |
---|---|---|---|
IpInReceives | 1654 | 933 | -721 |
IpInDelivers | 1654 | 933 | -721 |
IpOutRequests | 1585 | 867 | -718 |
IpOutTransmits | 1585 | 867 | -718 |
TcpPassiveOpens | 2 | 2 | 0 |
TcpInSegs | 1654 | 933 | -721 |
TcpOutSegs | 1633 | 906 | -727 |
TcpRetransSegs | 1 | 0 | -1 |
Ip6OutRequests | 1 | 0 | -1 |
Ip6OutMcastPkts | 1 | 0 | -1 |
Ip6OutOctets | 56 | 0 | -56 |
Ip6OutMcastOctets | 56 | 0 | -56 |
Ip6OutTransmits | 1 | 0 | -1 |
Icmp6OutMsgs | 1 | 0 | -1 |
Icmp6OutRouterSolicits | 1 | 0 | -1 |
Icmp6OutType133 | 1 | 0 | -1 |
TcpExtPAWSEstab | 1 | 0 | -1 |
TcpExtTW | 0 | 1 | 1 |
TcpExtDelayedACKs | 7 | 6 | -1 |
TcpExtTCPPureAcks | 20 | 21 | 1 |
TcpExtTCPLossProbes | 3 | 4 | 1 |
TcpExtTCPDSACKRecv | 1 | 0 | -1 |
TcpExtTCPOrigDataSent | 75 | 66 | -9 |
TcpExtTCPDelivered | 76 | 66 | -10 |
TcpExtTCPDSACKRecvSegs | 1 | 0 | -1 |
IpExtInOctets | 16909204 | 16947976 | 38772 |
IpExtOutOctets | 198548 | 137920 | -60628 |
IpExtInNoECTPkts | 12151 | 12027 | -124 |
MPTcpExtMPCapableSYNRX | 1 | 1 | 0 |
MPTcpExtMPCapableACKRX | 1 | 1 | 0 |
MPTcpExtMPTCPRetrans | 1 | 0 | -1 |
MPTcpExtMPJoinSynRx | 1 | 1 | 0 |
MPTcpExtMPJoinAckRx | 1 | 1 | 0 |
MPTcpExtOFOQueueTail | 849 | 492 | -357 |
MPTcpExtOFOQueue | 868 | 504 | -364 |
MPTcpExtOFOMerge | 503 | 405 | -98 |
MPTcpExtDuplicateData | 8 | 8 | 0 |
MPTcpExtRcvWndShared | 568 | 362 | -206 |
MPTcpExtRcvWndConflictUpdate | 1 | 0 | -1 |
Time (ms) (limit : 11921) | 11344 | 12033 | 689 |
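For reference, a minimal sketch of how such per-namespace counter deltas can be collected with nstat; "$ns1"/"$ns3" stand for the client and server namespaces created by the test, and the exact hook points in the script are an assumption:

```sh
# Update the nstat baseline in both namespaces without printing anything:
ip netns exec "$ns1" nstat -n
ip netns exec "$ns3" nstat -n

# ... run the transfer ...

# Print the counters that changed since the baseline:
ip netns exec "$ns1" nstat
ip netns exec "$ns3" nstat
```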
(the TcpInSegs/TcpOutSegs counters are interesting: the client and the server do not report the same numbers: issue? GRO?)
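GRO on the receive side would coalesce several wire segments into one before TCP accounts for them, so TcpInSegs on the receiver can legitimately be lower than the peer's TcpOutSegs without any loss. A quick way to check this hypothesis (interface and namespace names are assumptions, adjust to the ones the script creates):

```sh
# Is GRO enabled on the receiving veth?
ip netns exec "$ns3" ethtool -k ns3eth1 | grep generic-receive-offload

# If it is, turn it off and re-run to see whether the counters line up:
ip netns exec "$ns3" ethtool -K ns3eth1 gro off
```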
Maybe a first step is to send the patch we have in our tree, because it looks like it already helps? Or do we want a better fix with the new BW limits? cc: @pabeni
Note: the same graph as last time, but for the "success" run: (cc: @mjmartineau )
And all the captures I had: simult_flows_analyses.zip (cc: @pabeni )
Thanks for the success chart. In the failure graph it's interesting that the segment lengths settle into three main groups around the 6 second mark, until the throughput drops around 6.75s:
Not sure what it means yet. The success graph maintains the more randomized pattern without the "ceiling" around 34000 for segment length.
FYI, we recently had an issue with the unbalanced bwidth with opposed, unbalanced delay test as well, not just with unbalanced bwidth:
# 05 unbalanced bwidth 11365 max 11921 [ OK ]
# 06 unbalanced bwidth - reverse direction 11275 max 11921 [ OK ]
# 07 unbalanced bwidth with unbalanced delay 11357 max 11921 [ OK ]
# 08 unbalanced bwidth with unbalanced delay - reverse direction 11358 max 11921 [ OK ]
# 09 unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 11936 ms, expected 11921 ms max 11921 [FAIL]
# client exit code 1, server 0
#
# netns ns3-66599cb3-pXhHQv socket stat for 10009:
# State Recv-Q Send-Q Local Address:Port Peer Address:Port
# TIME-WAIT 0 0 10.0.3.3:10009 10.0.1.1:43328 timer:(timewait,59sec,0)
#
# TIME-WAIT 0 0 10.0.3.3:10009 10.0.2.1:54577 timer:(timewait,59sec,0)
#
#
# netns ns1-66599cb3-pXhHQv socket stat for 10009:
# State Recv-Q Send-Q Local Address:Port Peer Address:Port
# TIME-WAIT 0 0 10.0.2.1%ns1eth2:54577 10.0.3.3:10009 timer:(timewait,59sec,0)
#
# TIME-WAIT 0 0 10.0.1.1:43328 10.0.3.3:10009 timer:(timewait,59sec,0)
#
# -rw------- 1 root root 16777216 May 31 09:49 /tmp/tmp.Kgg7TsylGy
# -rw------- 1 root root 16777216 May 31 09:47 /tmp/tmp.uLdbZsVBoE
# -rw------- 1 root root 81920 May 31 09:47 /tmp/tmp.GFhlAAvlgr
# -rw------- 1 root root 81920 May 31 09:48 /tmp/tmp.yewEKG7Mia
# 10 unbalanced bwidth with opposed, unbalanced delay - reverse direction 11682 max 11921 [ OK ]
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9316053877
The netdev CI also got a similar issue recently (in non-debug):
# 09 unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 11933 ms, expected 11921 ms max 11921 [FAIL]
# client exit code 1, server 0
#
# netns ns3-VPWDdZ socket stat for 10009:
# State Recv-Q Send-Q Local Address:Port Peer Address:Port
# TIME-WAIT 0 0 10.0.3.3:10009 10.0.2.1:49301 timer:(timewait,59sec,0)
#
# TIME-WAIT 0 0 10.0.3.3:10009 10.0.1.1:60652 timer:(timewait,59sec,0)
#
#
# netns ns1-msD7GR socket stat for 10009:
# State Recv-Q Send-Q Local Address:Port Peer Address:Port
# TIME-WAIT 0 0 10.0.2.1%ns1eth2:49301 10.0.3.3:10009 timer:(timewait,59sec,0)
#
# TIME-WAIT 0 0 10.0.1.1:60652 10.0.3.3:10009 timer:(timewait,59sec,0)
#
# -rw------- 1 root root 16777216 Jul 7 11:30 /tmp/tmp.FbUJhacz2Q
# -rw------- 1 root root 16777216 Jul 7 11:28 /tmp/tmp.wwsjKhZcCg
# -rw------- 1 root root 81920 Jul 7 11:30 /tmp/tmp.UktnqZBu4h
# -rw------- 1 root root 81920 Jul 7 11:28 /tmp/tmp.yWNVMvOkTI
# 10 unbalanced bwidth with opposed, unbalanced delay - reverse direction 11712 max 11921 [ OK ]
https://netdev-3.bots.linux.dev/vmksft-mptcp/results/672921/3-simult-flows-sh/stdout
According to NIPA (history: normal - debug), this test is unstable. It looks like the instability can be seen both with and without unbalanced delay, but mainly with it.
In NIPA, many tests are executed in parallel. In theory, this should be fine: most selftests don't use all the available resources. Still, even if the issue also happens in non-debug mode, this parallelism could explain some of the instability.
The selftests: mptcp: simult flows: re-adapt BW patch in our tree (https://github.com/multipath-tcp/mptcp_net-next/commit/101192474e073b4059f98ddcb81b08d762b8be5e) could improve the situation, but its effect has not been properly measured. It would be good to: