abrasive / nxBender

Open source client for netExtender SSL VPNs
BSD 3-Clause "New" or "Revised" License

Limit the size of data packets to less than mtu #15

Open richardash1981 opened 4 years ago

richardash1981 commented 4 years ago

Default the limit to 1280 bytes, which I'm guessing may match the NetExtender client. Allow users to increase or decrease this if necessary for their server and internet path.
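A minimal sketch of what a configurable cap like this could look like (option name, help text and the pppd invocation are invented for illustration, not the actual nxBender code): the configured limit is exposed as a command-line option and passed straight to pppd, so pppd never produces a PPP frame larger than the server is asked to accept.

```python
# Hypothetical sketch: make the frame-size limit a user-tunable option
# and hand it to pppd as its MTU/MRU.
import argparse
import subprocess

parser = argparse.ArgumentParser()
parser.add_argument('--packet-size', type=int, default=1280,
                    help='largest PPP frame forwarded to the server (bytes)')
args = parser.parse_args()

pppd_cmd = [
    'pppd', 'notty', 'noauth', 'nodetach',
    'mtu', str(args.packet_size),   # cap frames we send upstream
    'mru', str(args.packet_size),   # ask the peer to cap the downstream too
]
pppd = subprocess.Popen(pppd_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
```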

I think this may fix #11, it certainly fixes a similar problem for me (triggered by upstream packets in particular).

richardash1981 commented 4 years ago

This commit isn't currently doing what it says above, I've noticed this morning. My local MTU is 1500 and I am running the client with --mtu 1500, so lines of up to 1496 bytes of pppd data are being sent. But once you add the headers, each 1500-byte write to the socket goes out with at least a 20-byte TCP header and a 20-byte IP header on top, so rather more than 1500 bytes. Yet it works fine for me like this (but doesn't with the original hard-coded 1514 bytes plus the same overhead).

The data is sent with the socket's .sendall() call, which keeps writing to the socket until everything has been sent. Since .sendall() raises an exception if it cannot send all the data, and I don't see any exception being raised, presumably that isn't what is going wrong.

So I'm not sure I understand what is going on here. It might be more logical to change the calculation to subtract the IP and TCP headers along with the line start/end characters. Also, by calling self.s.getsockopt(socket.SOL_TCP, socket.TCP_MAXSEG) on the TLS socket you can find out what the system thinks the Maximum Segment Size (MSS) of the connection to the server is. On my connection this comes out as 1460, i.e. 1500 minus the two 20-byte headers listed above. So maybe the default should be to make this call and subtract four from it, so that we automatically pump the data over the TLS connection in lines which fit in a single IP packet.
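For illustration, a short snippet showing the getsockopt call described here (the host name is a placeholder, the constants are the Linux ones, and the final "minus four" line is just the idea floated above, not settled behaviour):

```python
# Query the kernel's idea of the MSS on an established TLS connection.
# ssl.SSLSocket subclasses socket.socket, so getsockopt works directly on it.
import socket
import ssl

ctx = ssl.create_default_context()
raw = socket.create_connection(('vpn.example.com', 443))          # placeholder host
tls = ctx.wrap_socket(raw, server_hostname='vpn.example.com')

mss = tls.getsockopt(socket.SOL_TCP, socket.TCP_MAXSEG)
print('kernel-reported MSS:', mss)       # e.g. 1460 on a plain 1500-byte MTU path
print('candidate line limit:', mss - 4)  # leave room for the line start/end characters
```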

On the other hand, the cause of the loop/hang/dropped connection could be completely unrelated, and this change is just hiding it ...

abrasive commented 4 years ago

Thanks, this looks good. By upstream you mean from client to server, right?

I'm pretty sure I tested encapsulated packets up to the original MTU with no problems - but the NX server seems to have evolved quite a bit over the years. I will revisit this against the endpoint I have access to.

On your last comment, methinks there is nothing to worry about. TCP already manages segmentation, and it's entirely decoupled from the boundaries of the packets we write to the pipe. Of course it's possible that you do have path MTU issues between you and the server, but that's rather unlikely these days; you can test for it with ping -M do, if I recall correctly. Also, a write can generate more than 40 bytes of overhead - there is TLS record overhead too.

I think the issue is probably just that the NX server is barfing when it gets a piped packet longer than it was designed to handle.

richardash1981 commented 4 years ago

Yes, "upstream" = the client to server direction. I did some tests with ping -M do (notes are in my comment on issue #11) and to that server 1500 bytes is the correct MTU, which led me to jump to conclusions.

I've now got another data point on this, because I also have access to a (newly commissioned) SMA 400 appliance (probably - users of the Mobile Connect client are being told to select SMA 400 as the model when it asks them). When connected to this device, Python reports the MSS of the TLS socket as 1260 - no idea why. However, connecting with --mtu 1500 (i.e. sending 1496-byte lines over the TLS connection) seems to work fine. I can't directly repeat the test I was originally using, because this appliance has different servers behind it, but SCP through the tunnel over mobile data works fine.

Using this SMA 400, the link does have a smaller MTU. The largest ICMP packets which get through are from ping -M do -s 1272, which sends 1280-byte IP payloads (1272 bytes of data plus the 8-byte ICMP header). Swap the 8-byte ICMP header for a 20-byte TCP header and you get 1260, so it looks like TCP_MAXSEG on the TLS socket does not include the TLS overhead (which is a shame).
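Spelling out that header arithmetic explicitly (plain Python, values taken from the ping test above):

```python
icmp_data   = 1272   # largest ping -M do -s value that got through
icmp_header = 8
tcp_header  = 20

ip_payload = icmp_data + icmp_header   # 1280 bytes carried inside each IP packet
mss        = ip_payload - tcp_header   # 1260 -> matches the reported TCP_MAXSEG
print(ip_payload, mss)
```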

But the important outcome is that the behaviour has nothing to do with the link MSS: the TCP stack segments the data successfully as needed, provided we restrict the line length to something the SonicWall server can cope with.

richardash1981 commented 4 years ago

Changed the name of the option to reflect what it really does, and got rid of the -4 calculation as spurious. The default is now 1500, as that works for me on the server where I hit the problem (where 1514 did not). Users are free to tune it upwards until they break it!
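A hedged sketch of how the reworked option might be declared (the option name and help text here are invented for illustration; the commit itself is the authority on the real name):

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical option: default of 1500, no automatic -4 adjustment.
parser.add_argument('--max-line-size', type=int, default=1500,
                    help='largest framed PPP packet sent to the server; '
                         'raise at your own risk, lower if the server misbehaves')
args = parser.parse_args()
```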