zehome / MLVPN

Multi-link VPN (ADSL/SDSL/xDSL/Network aggregation / bonding)
http://www.mlvpn.fr/
BSD 2-Clause "Simplified" License

Mlvpn 2.3.1 and packet massive DUP ACK/retransmission #100

Open Exaltia opened 7 years ago

Exaltia commented 7 years ago

Hello, I am using mlvpn 2.3.1, compiled from this repository. This problem seems to have existed since I started using this version (in other words, it's not something new), but I have only suspected a problem since I added a 3rd link to my mlvpn setup.

Like the title says, I have massive dup ACKs (up to 40, and probably more, dup ACKs for a single packet) and other problems that overload my upload. The symptom is around 1 Mb/s of upload spent on TCP ACKs (and related packets) for around 10 Mb/s of download. Server and client configs are attached: mlvpn server+client conf.txt
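To put the reported 1 Mb/s of ACK traffic in perspective, here is a rough back-of-the-envelope estimate of what plain TCP ACKs would normally cost for a 10 Mb/s download. It is only a sketch: the 52-byte ACK size (IP + TCP + timestamp option, no Ethernet or tunnel overhead) and the "one ACK per two segments" delayed-ACK ratio are assumptions, not values taken from the captures.

```python
# Rough estimate of the upload bandwidth that plain TCP ACKs should need.
# All defaults below are illustrative assumptions, not measurements.

def expected_ack_rate_bps(download_bps, mtu=1362, ack_bytes=52, segments_per_ack=2):
    """Return the approximate ACK traffic (bits/s) for a given download rate."""
    data_packets_per_s = download_bps / (mtu * 8)       # full-size data packets
    acks_per_s = data_packets_per_s / segments_per_ack  # delayed ACKs (assumed)
    return acks_per_s * ack_bytes * 8


download = 10_000_000  # ~10 Mb/s download, as reported in this issue
observed = 1_000_000   # ~1 Mb/s of ACK upload, as reported in this issue
baseline = expected_ack_rate_bps(download)

print(f"expected ACK upload: {baseline / 1e6:.2f} Mb/s")
print(f"observed / expected: {observed / baseline:.1f}x")
```

Under these assumptions the baseline comes out at roughly 0.2 Mb/s, so the observed ACK traffic would be several times higher than a healthy connection should need.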

I've tested everything that came to mind on each side of the tunnel. Disabling or enabling the reordering buffer has no effect. Neither does changing its size (tried values 16, 64, 128, 256, 512, 1024), nor modifying the MTU (I have no problems accessing websites; the MTU currently in use is 1362, on both sides, of course), nor modifying/disabling the latency_increase values. No combination of these attempts changed anything either.

I have even tried updating to mlvpn version master-14e17e8, but with this version I was never able to make the links work. If you say it should have worked (with the same config), I will open a separate issue.

There is no problem with a TCP file transfer from the router to the computer (I used SFTP for the test). I have also tested each of my links outside the mlvpn tunnel, and I have no problems there either: the DUP ACK rate returns to a really reasonable threshold (maybe 5% DUP ACK).

The following tcpdump is from eth0, where my two DSL modems and my phone are connected: https://drive.google.com/file/d/0B5omNo0ix2cgek5xRGtvUlJFTms/view?usp=sharing
And this one is from the eth1 side, my local network: https://drive.google.com/file/d/0B5omNo0ix2cgUE5KcVVJaU4xaGM/view?usp=sharing
(The files are not accepted by GitHub.)

On a side note: I use a TAP interface because I relay my IPv6 subnet from my server to my home, something I wasn't able to do with the TUN interface type (disabled for now, to be sure not to add a side problem to this one).

Is there something that I have done terribly wrong, is TAP not designed for reordering (or does it technically prevent mlvpn from doing it), or is it a real bug? Thanks in advance for the help (and if needed, if I wasn't clear in my English explanation, I will rewrite it for you in French).

PS: the tcpdump files are intended to be read with Wireshark.
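For readers unfamiliar with the term, a duplicate ACK is a pure ACK that repeats the acknowledgment number of the previous one, which a receiver sends when data arrives out of order or is missing. The sketch below counts them over a simplified, hypothetical list of segments for one direction of one connection. The `Segment` type and the sample values are made up for illustration, and Wireshark's own duplicate-ACK analysis applies more conditions than this.

```python
from collections import namedtuple

# Hypothetical, simplified view of one TCP segment (one direction, one flow).
Segment = namedtuple("Segment", "ack payload_len")


def count_duplicate_acks(segments):
    """Count pure ACKs that repeat the previous ACK number.

    Returns (total_dup_acks, longest_run_of_dup_acks).
    """
    total = 0
    longest = 0
    run = 0
    last_ack = None
    for seg in segments:
        if seg.payload_len == 0 and seg.ack == last_ack:
            run += 1
            total += 1
            longest = max(longest, run)
        else:
            run = 0  # a new ACK number (or a data segment) ends the run
        last_ack = seg.ack
    return total, longest


# Made-up sample: the receiver keeps acknowledging 2000 while a packet is missing.
sample = [Segment(a, 0) for a in (1000, 2000, 2000, 2000, 2000, 3000)]
print(count_duplicate_acks(sample))
```

A long run for a single acknowledgment number, such as the "up to 40" mentioned above, means many later segments arrived before the missing one did.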

Exaltia commented 7 years ago

Another detail, just in case:
Server: Debian 8.7, kernel: Linux mlvpn 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 GNU/Linux
Client: Debian 8.7, kernel: Linux Nozdormu 3.16.0-4-686-pae #1 SMP Debian 3.16.39-1 (2016-12-30) i686 GNU/Linux

Exaltia commented 7 years ago

I did more investigation over the weekend, right down to the network cables. Everything works fine if each connection has its own network card. As soon as I plug the DSL modems in behind the same switch (note: each modem handles PPP itself; it is not done on the mlvpn computer), the heavy dup ACKs and retransmissions come back. In the end, I doubt it is a bug. I will try one last switch, then put each subnet on its own VLAN if nothing better comes up. Feel free to close the issue if it is really something you're not involved in anymore. Sorry for the burden.

markfoodyburton commented 7 years ago

I'm interested in this - it could explain the issues I'm having too (lower bandwidth than I expect). I have not looked for the dup ACKs yet. How are you doing this, BTW: using Wireshark or something else? It would be nice to replicate your results.

Have you played with the re-order buffer? One possibility is that packets are arriving out of order (through one path or the other), and the end points are then re-transmitting. The issue I have is that adding a small reorder buffer doesn't seem to be effective, and a larger one actually seems to have a detrimental impact on performance (I have not yet found a 'sweet spot' in the middle). Maybe your mileage varies - I would be interested to know.
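The trade-off described here can be illustrated with a toy reorder buffer keyed on sequence numbers. This is a minimal sketch and not mlvpn's actual implementation: the `ReorderBuffer` class, its overflow policy, and the arrival order are all assumptions made for illustration, and a real buffer would normally also release packets on a timeout.

```python
import heapq


class ReorderBuffer:
    """Toy sequence-number reorder buffer (illustrative, not mlvpn's code)."""

    def __init__(self, size):
        self.size = size    # max packets held while waiting for a gap to fill
        self.next_seq = 0   # next sequence number expected for delivery
        self.heap = []      # min-heap of buffered sequence numbers

    def push(self, seq):
        """Accept one packet and return the sequence numbers released."""
        if seq < self.next_seq:
            return [seq]  # arrived after we gave up on it: delivered late
        heapq.heappush(self.heap, seq)
        released = []
        while self.heap:
            if self.heap[0] <= self.next_seq:
                head = heapq.heappop(self.heap)
                released.append(head)
                self.next_seq = max(self.next_seq, head + 1)
            elif len(self.heap) > self.size:
                self.next_seq = self.heap[0]  # overflow: stop waiting for the gap
            else:
                break  # keep waiting for the missing packet
        return released


def deliver(arrivals, size):
    """Feed packets in arrival order and return the delivery order."""
    buf = ReorderBuffer(size)
    out = []
    for seq in arrivals:
        out.extend(buf.push(seq))
    return out


arrivals = [0, 2, 3, 1, 4]  # packet 1 took the slower link
print(deliver(arrivals, size=4))  # large enough: delivered in order
print(deliver(arrivals, size=1))  # too small: packet 1 is delivered late
```

In this sketch a buffer that is too small gives up on the gap and lets the late packet through out of order, which the TCP endpoints see as reordering. A larger buffer restores the order at the cost of holding packets back, which is one plausible reason a big buffer might add latency.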

markfoodyburton commented 7 years ago

'Anecdotally' - it seems my ACK overhead is reduced by increasing the reorder buffer (once it is past the threshold below which the buffer is too small), but my bandwidth is slightly negatively impacted... perhaps some sort of throttling effect.

Exaltia commented 7 years ago

Yes, I'm doing it using Wireshark on the Windows computers, and tcpdump on the routers. I don't have the lower-bandwidth issue with version 2.3.1 (this is not master-14e17e8). I have played heavily with the reorder buffer, on both sides, up to crazy values (2017-04-03T09:45:44 [INFO/config] reorder_buffer_size changed from 512000 to 2147483647, aka the max variable value); it changed nothing. I am now sure that, unfortunately, and compared to version master-14e17e8, reordering is not working on 2.3.1 when all the connections are behind the same LAN card. I think there is nothing more we can do on 2.3.1, and it's probably not worth fixing things in this version. I will open a new issue specifically for version master-14e17e8, as I'm sure that reordering works in that version, but there are other problems (one of them is that the larger you set the reordering buffer, the lower the bandwidth is).

markfoodyburton commented 7 years ago

Yeah - I've been playing with master (modified a little). I certainly see an effect from the re-order buffer (so long as it's big enough to cope with the traffic, I guess), so it needs to be about 32 in my case. However, the effect is minimal and, as you say, I seem to see a negative impact on bandwidth...

Exaltia commented 6 years ago

It's a bit of a whim, I admit, but I came back to this whole merry problem. I have put my finger on something new which, I think, completely rules out mlvpn as the problem. I ran an iperf test over TCP, from two machines to my gateway; see the attached file resultats iperf local.txt. I think we can now close this as "Not a bug". For me, if I already get Retr (retransmits) purely on the local network, it seems clear that mlvpn will never be able to cope properly.

My apologies, in the end, for all this noise.