erebe / wstunnel

Tunnel all your traffic over Websocket or HTTP2 - Bypass firewalls/DPI - Static binary available

Network performance decrease with latency : no tcp window scaling #80

Closed emachabe closed 2 years ago

emachabe commented 2 years ago

Hi Romain,

I am facing a throughput decrease when connecting remotely (i.e. with latency higher than 1 ms). If I do a test locally I can reach around 60 MB/s; if I do a test remotely over a link with 12 ms latency, it drops to 6 MB/s.

After a lot of tcpdump work searching for transport issues (there were none), I found out that wstunnel is using a small TCP window with a small scaling factor. A TCP window that stays fixed during transmission explains why the higher the latency, the lower the throughput.

I started digging into the source code to find the buffer size (I don't know Haskell, but I found it, and it looks like you did something that should get it from the system; however, changing it through sysctl changes nothing). I was trying to rebuild from source but got a linking error (I am running version 2.7.1 on Oracle Linux 8):

(/usr/bin/ld.gold: error: cannot find -lgmp) .stack/programs/x86_64-linux/ghc-tinfo6-8.8.4/lib/ghc-8.8.4/integer-gmp-1.0.2.0/libHSinteger-gmp-1.0.2.0.a(Type.o): function integerzmwiredzmin_GHCziIntegerziType_zdwplusBigNatWord_info: error: undefined reference to '__gmpn_add_1'

Do you have any hint about that behaviour? Since I see no errors in the TCP capture, only that small window size that never grows, I think that is where the issue lies.

Thank you

emachabe commented 2 years ago

The more I dig the more I doubt. :-)

Doing an SCP inside the wstunnel: 6 MB/s, peaking at 6400 pkts/s. Doing an SCP outside the wstunnel: 94 MB/s, peaking at 34000 pkts/s.

Any idea what could cause that rate cap?

Edit: CPU peaks at 9%. Tried TCP and UDP, same behaviour.

erebe commented 2 years ago

Hello,

Can you provide me a pcap capture of your scp under wstunnel, and tell me how you start the wstunnel client and server?

On my side, doing a raw scp I peak at 29-30 MB/s, while with wstunnel I peak at 22 MB/s (and I do see the window size moving).

emachabe commented 2 years ago

To start with a clean base for troubleshooting, here is a pcap of a simple setup.

On the client side, wstunnel is started (as root, for the test) with:
./wstunnel -L 127.0.0.1:2222:127.0.0.1:22 ws://10.64.92.118:9090

On the server side: ./wstunnel --server ws://10.64.92.118:9090 -r 127.0.0.1:22

Then a simple SCP on the client to download a 112 MB file from the server:

scp -P 2222 root@127.0.0.1:/root/FILE . 100% 112MB 6.6MB/s 00:16

If I do the same outside the wstunnel: scp root@10.64.92.118:/root/FILE . 100% 112MB 94.7MB/s 00:01

The capture is truncated to avoid too large a file; you have both server and client captures in the zip. The capture is filtered on TCP port 9090 and ICMP (to detect any Type 3 messages during transit). I captured the handshake of the tunnel establishment to allow full analysis of the window scaling mechanism.

You can see the [TCP Window Full] events occurring in the server-side capture: capture.zip

emachabe commented 2 years ago

Thank you for your quick reply and time spent on this issue.

I am sorry I did not give you all the details, but the long-haul optical fiber uses a reduced MTU due to encryption in transit. There is equipment performing MSS clamping to adjust the MSS, and ICMP Type 3 Code 4 is allowed through to permit path MTU discovery. This network is deeply monitored (in real time) and finely tuned to detect and avoid that kind of issue.

As you can see, an SCP copy outside wstunnel shows nearly line-rate throughput. I also tested a wget download using an nginx server on the server side and got the same result as SCP.

To rule out MTU-related issues, I also tried with the server and client configured with a lower MTU than the transit link. The behaviour and throughput are the same as shown in the capture.

Regarding the sysctls, I already did a lot of testing before bothering you ;-) including tuning the window scaling feature, etc.

That's why I was trying to recompile on Oracle Linux 8: I suspect something is wrong at the socket layer within wstunnel, because other programs such as scp or wget are not experiencing any issues. But I'm stuck on the linking error.

Thank you

erebe commented 2 years ago

Thanks for the explanation, going to take a deeper look at it during the weekend. Seems to be an interesting issue that you have here ;)

erebe commented 2 years ago

Is it possible for you to run the following on the server

perf record --call-graph dwarf -p $(pidof wstunnel)

while doing an scp, and send me the recording please?

emachabe commented 2 years ago

Here it is, during the SCP

perf.zip

erebe commented 2 years ago

So the issue was in wstunnel. It turns out that setting the SO_SNDBUF socket option to configure the size of the send buffer disables dynamic TCP window auto-scaling :x I had trouble finding information about it; even the setsockopt man page does not mention it.

So in the future, if you find an application doing a setsockopt syscall with SO_SNDBUF, you will know it is broken :(

Can you try the wstunnel build from https://github.com/erebe/wstunnel/releases/download/%2380/wstunnel and let me know if it works better for you?

P.S.: Out of curiosity, which field do you work in that requires encrypting your network?

emachabe commented 2 years ago

Hi,

You nailed it quickly. Just look at the result with the new binary: scp -P 2222 root@127.0.0.1:/root/FILE . 100% 112MB 98.8MB/s 00:01

I also tried a transfer over UDP (using a WireGuard tunnel) and I can reach a stable 22 MB/s. I'm going to play with this tomorrow to see what could be improved.

In both TCP and UDP wstunnel never went above 35% CPU :-)

Thank you. Have a great Sunday.

Ps: I'm working in the healthcare software field.

erebe commented 2 years ago

Ok, perfect then :) I am going to make a new release tomorrow. Thanks for reporting the issue; I would never have found it by myself with my usage :+1:

erebe commented 2 years ago

New release done, https://github.com/erebe/wstunnel/releases/tag/v4.0 ! Thanks for reporting again and enjoy :)

emachabe commented 2 years ago

Thank you !

Have a nice day.