multipath-tcp / mptcp

⚠️⚠️⚠️ Deprecated 🚫 Out-of-tree Linux Kernel implementation of MultiPath TCP. 👉 Use https://github.com/multipath-tcp/mptcp_net-next repo instead ⚠️⚠️⚠️

Slow (very slow) connection with 3 connections but not with 2 #282

Open dur3x opened 6 years ago

dur3x commented 6 years ago

In the past I had three ADSL boxes from the same internet provider, but recently I replaced one of them with another box (WAN2) from a different provider (to get provider redundancy, and also because the bandwidth was better). With the three old ADSL boxes from the same ISP I got good results using wan1+wan2+wan3, but since I got the new box (wan2) the three-path setup is just not usable. So currently I'm running with wan1+wan3 or just wan2. I'm open to any suggestions/tests :) I really don't understand why I get such poor results with 3 connections now and not in the past with my old ISP. Thanks in advance for your help.

Attached: dump.pcap.zip

WAN1 (without mptcp)

# curl --interface wan1 http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 12  120M   12 15.5M    0     0   787k      0  0:02:36  0:00:20  0:02:16  930k^C

WAN2 (without mptcp)

# curl --interface wan2 http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 34  120M   34 41.5M    0     0  2119k      0  0:00:58  0:00:20  0:00:38  917k

WAN3 (without mptcp)

# curl --interface wan3 http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 31  120M   31 37.6M    0     0  1895k      0  0:01:05  0:00:20  0:00:45 1947k

WAN1 + WAN3

# curl http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 14  120M   14 17.9M    0     0   881k      0  0:02:20  0:00:20  0:02:00 1182k

WAN3 + WAN2

# curl http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 44  120M   44 54.0M    0     0  2727k      0  0:00:45  0:00:20  0:00:25 1900k

WAN1 + WAN2

# curl http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 10  120M   10 12.8M    0     0   611k      0  0:03:22  0:00:21  0:03:01  738k

WAN1 + WAN2 + WAN3

# curl http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  120M    0  996k    0     0  50071      0  0:42:08  0:00:20  0:41:48 13686

WAN1 + WAN2 + WAN3 (with the tcpdump pcap attached to this issue)

# curl http://multipath-tcp.org/snapshots/mptcp_2016_04_18.tar.gz > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  1  120M    1 1669k    0     0  79329      0  0:26:36  0:00:21  0:26:15 12882

rstanislav commented 6 years ago

Have you tested each connection's RTT (ping), and also run the MPTCP-capable tests (on the MPTCP status page)? A difference in ping can have a big impact on the results.

dur3x commented 6 years ago

Indeed, my results are not always stable (high ping or link disruption), but I didn't think it could have such a big impact (dropping from 2 MB/s to around 0 KB/s).

WAN1

--- 8.8.8.8 ping statistics ---
101 packets transmitted, 100 packets received, 0% packet loss
round-trip min/avg/max = 127.688/327.328/1250.665 ms

WAN2

--- 8.8.8.8 ping statistics ---
53 packets transmitted, 53 packets received, 0% packet loss
round-trip min/avg/max = 14.931/30.788/67.409 ms

WAN3

--- 8.8.8.8 ping statistics ---
134 packets transmitted, 134 packets received, 0% packet loss
round-trip min/avg/max = 16.170/19.121/67.605 ms

rstanislav commented 6 years ago

In my experience that can have a huge impact. Even with 2 LTE modems, if one gets 20 Mb/s and the second gets 5-10 Mb/s but with 3 times the ping, the result is not even close to 20 Mb/s. If the connections have around the same RTT (ping), it works fine.

Also, why is the ping so high on the wan1 ADSL? That looks like a problem, maybe hardware?
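To put a number on the RTT gap discussed above, here is a minimal Python sketch that parses the three ping summary lines posted earlier in this thread and reports each path's average RTT relative to the best path. The function names and the idea of using the ratio of averages as an "imbalance" figure are illustrative assumptions, not something the MPTCP kernel computes or exposes.

```python
import re

# Summary lines copied from the ping output earlier in this thread.
PING_SUMMARIES = {
    "wan1": "round-trip min/avg/max = 127.688/327.328/1250.665 ms",
    "wan2": "round-trip min/avg/max = 14.931/30.788/67.409 ms",
    "wan3": "round-trip min/avg/max = 16.170/19.121/67.605 ms",
}

_RTT_RE = re.compile(r"min/avg/max = ([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_rtt(summary):
    """Return (min, avg, max) RTT in milliseconds from a ping summary line."""
    match = _RTT_RE.search(summary)
    if match is None:
        raise ValueError("unrecognised ping summary: %r" % summary)
    return tuple(float(group) for group in match.groups())


def rtt_imbalance(summaries):
    """Map each path to its average RTT divided by the lowest average RTT.

    This ratio is a hypothetical indicator chosen for illustration only.
    """
    averages = {path: parse_rtt(line)[1] for path, line in summaries.items()}
    best = min(averages.values())
    return {path: avg / best for path, avg in averages.items()}


if __name__ == "__main__":
    for path, ratio in sorted(rtt_imbalance(PING_SUMMARIES).items()):
        print("%s: %.1fx the best average RTT" % (path, ratio))
```

With the figures from this thread, wan1's average RTT comes out at roughly 17 times that of wan3, far beyond the "3 times more ping" case described above.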

dur3x commented 6 years ago

Thanks for your feedback. In fact, I'm not connected to these three ADSL boxes directly with an Ethernet cable; I reach them over wifi. The three ADSL modems actually belong to my neighbours, who share their access with me. The RTT (ping) of each connection varies. Sometimes it is stable and all connections have a similar ping, but whenever I download a file I can confirm that the RTTs differ on each connection, and that may indeed be the root cause.

rstanislav commented 6 years ago

For example, I'm currently testing with 4 LTE modems installed in a vehicle. In the city, if all 4 have good signal/RTT, I'm pushing the setup to its limit. I'm using a Raspberry Pi 3B+ connected to a router (for wifi) via a 100 Mb/s link (4 Ethernet wires; from what I found on the internet, 91-92 Mb/s is the most an RPi 3B+ can push over 100 Mb Ethernet), and in many cases I get 60 to 91 Mb/s.

I was using 2 SIM cards from the same operator (Megafon) and am now testing with 4 different ones (I replaced 1 Megafon with Tele2, so in the end I have 4 different operators: MTS, MEGAFON, BEELINE, TELE2). In most cases the results are way lower, but I don't need high speed; I need high coverage and redundancy outside the city, and Tele2 has advantages outside the city or between cities. As a result I get a lower but more stable internet speed: in situations where the other 3 operators have no signal, Tele2 works and I have internet. The downside is the speed/quality of Tele2 3G/LTE, which is poor, so the overall speed is way lower in most cases. So yes, 1 poor link can have a huge impact on the final aggregated speed of the connection.

In your case, maybe you can use directional antennas pointed at your neighbours to improve the link quality? WiFi is not the best way to connect, and under high load with a poor signal it usually results in packet loss, which affects speed greatly.

dur3x commented 6 years ago

Interesting :-) I have done many tests in the past and never observed this behaviour. In all my earlier tests, the worst case was that my speed equalled that of the worst path/ADSL. Here, as you can see, my worst path is around 900 KB/s, yet in most tests with 3 paths the connection drops to around 0 and is finally interrupted (voluntarily or not). I agree that my current setup is not optimal and that improvements can of course be made, but behaviour this bad looks too significant to ignore.
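The comparison made here can be spelled out with the curl figures from the first post. The sketch below is a rough illustration in Python: it assumes curl's progress-meter suffixes are 1024-based, treats each roughly 20-second run as representative (which it may not be on such variable links), and uses an "efficiency" ratio that is a made-up helper, not an MPTCP metric.

```python
def curl_speed_to_bytes(value):
    """Convert a curl 'Average Dload' field such as '787k' or '50071' to bytes/s.

    Assumes curl's progress-meter suffixes are 1024-based (k = 1024 bytes).
    """
    suffixes = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if value[-1] in suffixes:
        return float(value[:-1]) * suffixes[value[-1]]
    return float(value)


# 'Average Dload' values copied from the curl runs in the first post.
SINGLE_PATH = {"wan1": "787k", "wan2": "2119k", "wan3": "1895k"}
COMBINED = {
    ("wan1", "wan3"): "881k",
    ("wan2", "wan3"): "2727k",
    ("wan1", "wan2"): "611k",
    ("wan1", "wan2", "wan3"): "50071",
}


def aggregation_efficiency(single, combined):
    """Return measured aggregate speed divided by the sum of single-path speeds."""
    result = {}
    for paths, measured in combined.items():
        ideal = sum(curl_speed_to_bytes(single[path]) for path in paths)
        result[paths] = curl_speed_to_bytes(measured) / ideal
    return result


if __name__ == "__main__":
    for paths, ratio in aggregation_efficiency(SINGLE_PATH, COMBINED).items():
        print("%s: %.1f%% of the summed single-path speed" % ("+".join(paths), 100 * ratio))
```

On these numbers, wan2+wan3 reaches roughly two thirds of the summed single-path speed, every combination that includes wan1 does noticeably worse, and the three-path run lands at about 1%, well below even the slowest single path.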

Anyway, I'm currently always using 2 paths with 1 backup, and it's clearly sufficient. I don't know whether we will ever get a solution for this because, as you said, it's perhaps linked to one of the remote components (hardware, provider, noisy signal, ...).

pRiVi commented 6 years ago

I don't think so. As mentioned in bug #283, a similar setup shows that the project is not in a usable state, and the situation has been the same for years.

suyuan168 commented 5 years ago

About the Raspberry Pi 3B+: isn't the official claim that its network speed can reach 300 Mb/s? I spent half a day on it, and the speed is always capped at 100 Mb/s and never goes above that. Either this contradicts the official claims, or something in the code locks the NIC speed.

rstanislav commented 5 years ago

In my case it was the wifi router that has a 100 Mb link, so in the end I get about 93 Mb/s. In theory, using all the USB ports on the RPi 3B+, I think about 130-150 Mb/s is possible with a gigabit Ethernet link, considering that all USB ports go through a USB hub chip that is shared with the RPi Ethernet (which is itself a USB-to-Ethernet chip) and connected to the RPi CPU via a single USB link.

suyuan168 commented 5 years ago

I can't exceed 100 Mb/s on the RPi 3B+ anyway. My local computer shows that it is linked to the RPi 3B+ through a 1000M network card. I use the RPi 3B+ as an OpenWrt router, and the test speed never exceeds 100 Mb/s. I think there is a problem. Thank you for your answer.

matttbe commented 5 years ago

Bad performance can be due to many things: CPU, hardware and, of course, bugs. The best approach is certainly to analyse traces to find out which side is blocking, and then analyse what's wrong on that device, e.g. check CPU utilisation.

But now that I see you are using a RPI 3B+, it might be due to this "low-end" device: https://www.raspberrypi.org/forums/viewtopic.php?t=208512

suyuan168 commented 5 years ago

I just tried my x86 soft router, which has 6 Gigabit Ethernet ports and USB 3.0. I used six 4G network cards plus a 200M fiber line. With ORM the test speed is only 110M, which is very low. When I removed the 4G cards and used only the 200M fiber, the line ran at full speed: the maximum is 200M, and it is very fast. So I think MPTCP is very unfriendly to USB 4G. Not only does the speed not improve, it actually drops. My conclusion is that it seems to average the network speeds. Thank you, everyone.

pRiVi commented 5 years ago

No no no!

It is as I already said in a different bug that just got closed without any attention: if you have latency, MPTCP fails completely.

They only tested in their lab, without packet loss and without (changing) latency, so you got what they developed: a local-switch-only solution.

suyuan168 commented 5 years ago

There may be delays and slow links that affect the overall speed. The result is not 1+1+1=3; it may be 1+1+1=1.5. But I am still very grateful to the MPTCP community team for their contributions, which help the network become more stable. If this problem could be solved, it would be perfect; even 1+1+1=2.5 would be great. I am just describing the speed of the network. On a moving network, we don't know which WAN's speed will change, but we still hope for more bandwidth with less delay. Can someone think of a better solution to this problem? Thank you very much, everyone.

matttbe commented 5 years ago

Don't hesitate to look at the comments in #334. In short, this use case probably requires another MPTCP packet scheduler, and traces need to be analysed to understand what's wrong: why one side doesn't accept more, or why the other side doesn't push more. Maybe you are "simply" limited by the window size: because of the latency, you might need to buffer more. Some schedulers might use less buffer space.
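As a rough way to gauge the "you might need to buffer more" point, the sketch below applies the receive-buffer rule of thumb from the MPTCP specification (RFC 6824), 2 * sum(BW_i) * RTT_max, to the single-path curl speeds and ping RTTs posted earlier in this thread. Treating those short measurements as the path bandwidths and RTTs is an assumption made for illustration, and the function name is hypothetical.

```python
KIB = 1024
MIB = 1024 * 1024


def recommended_receive_buffer(bandwidths_bytes_per_s, rtts_s):
    """Rule-of-thumb MPTCP receive buffer in bytes: 2 * sum(BW_i) * RTT_max."""
    if not bandwidths_bytes_per_s or not rtts_s:
        raise ValueError("at least one bandwidth and one RTT are required")
    return 2 * sum(bandwidths_bytes_per_s) * max(rtts_s)


# Single-path 'Average Dload' values (wan1, wan2, wan3) from the first post,
# assuming curl's 'k' suffix means 1024 bytes.
BANDWIDTHS = [787 * KIB, 2119 * KIB, 1895 * KIB]

# Average and maximum ping RTTs (wan1, wan2, wan3) from this thread, in seconds.
AVERAGE_RTTS = [0.327328, 0.030788, 0.019121]
MAXIMUM_RTTS = [1.250665, 0.067409, 0.067605]

if __name__ == "__main__":
    typical = recommended_receive_buffer(BANDWIDTHS, AVERAGE_RTTS)
    worst = recommended_receive_buffer(BANDWIDTHS, MAXIMUM_RTTS)
    print("average-RTT case: %.1f MiB" % (typical / MIB))
    print("maximum-RTT case: %.1f MiB" % (worst / MIB))
```

With these inputs the estimate is about 3 MiB when wan1 sits at its average RTT and close to 12 MiB at its maximum. It may be worth comparing that with the largest value in net.ipv4.tcp_rmem on the receiving machine, although only a trace analysis as suggested above can confirm whether the window is really the limit.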

suyuan168 commented 5 years ago

Thank you very much. I have tried BBR, OLIA, BALIA and wVegas, with no improvement. I will continue to try.

matttbe commented 5 years ago

I don't think the TCP CC (net.ipv4.tcp_congestion_control sysctl) will change the situation much. The MPTCP packet scheduler (net.mptcp.mptcp_scheduler sysctl) might, if you have a recent (development version) MPTCP kernel.
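To check which congestion control and which packet scheduler a machine is currently using, the two sysctls named above can be read through /proc/sys. This is a minimal Python sketch: the helper name and its `root` parameter are hypothetical (the parameter only makes the function easy to test), and net.mptcp.mptcp_scheduler is expected to exist only on a kernel built from this out-of-tree MPTCP implementation.

```python
from pathlib import Path

# The two sysctls mentioned in the comment above.
SYSCTLS = ("net.ipv4.tcp_congestion_control", "net.mptcp.mptcp_scheduler")


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a dotted sysctl name, or None if it cannot be read.

    The dotted name is mapped to a path under `root`, for example
    net.mptcp.mptcp_scheduler -> /proc/sys/net/mptcp/mptcp_scheduler.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file (for example, a kernel without this sysctl) or no access.
        return None


if __name__ == "__main__":
    for name in SYSCTLS:
        value = read_sysctl(name)
        print("%s = %s" % (name, value if value is not None else "(not available)"))
```

If the scheduler line prints "(not available)", the running kernel most likely does not expose that sysctl, in which case changing the scheduler is not an option on that machine.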

suyuan168 commented 5 years ago

Thank you.

pRiVi commented 5 years ago

There is so much work to be done for this project to be useful in most use cases, if not in any...

matttbe commented 5 years ago

For those here who are using more than 2 subflows and see issues when one subflow is bad, could you please look at my last message in #334?

Of course, if you think that this project, used by millions of people, is not useful, there is no need to read this message or test anything.