Sandertv / go-raknet

Go library implementing a basic version of the RakNet protocol.
MIT License
106 stars 38 forks

Proxy and MTU matching #9

Closed flplv closed 4 years ago

flplv commented 4 years ago

Hello, I believe I found a bug and will submit a patch soon.

The bug is this:

When the project is used as a proxy, the MTU negotiated with the client may differ from the MTU negotiated with the server, and that breaks the RakNet protocol, which is MTU-sensitive. The example proxy forwards packets between server and clients as raw data without reassembling them, while the MTUs are negotiated separately inside the RakNet handshake: once between proxy and server, and once between proxy and each client.

The solution I found was to connect to the server first, get the server's MTU, and disconnect. Then wait for client connections, limiting the MTU negotiated with each client to the server's MTU, and when opening the connection to the server after a client connects, limit the server-side MTU to the client's. This covers the cases where either side has a larger MTU than the other, by limiting the connection's MTU to the smaller of the two.

flplv commented 4 years ago

During my tests I confirmed the bug in the case where the server has a smaller MTU than the client. I did not test the inverse, where the client has the smaller MTU, but I assume it would confirm too, since the server would send packets larger than the client expects and the client could not decode them.

If you'd like I can share the tcpdumps.

Sandertv commented 4 years ago

I don't think I see the issue here. The raknet client and server establish an MTU size based on the other end and use that MTU size exclusively for that connection. I don't see why that would be a problem, as fragmentation of packets happens on both ends depending on their respective MTU sizes.

flplv commented 4 years ago

Yep, the RakNet handshakes will establish the MTU on a per-connection basis in a proxy use case, not end to end.

Take this example:

Assume the MTUs are established as follows:

- proxy to server: MTU is 1300
- proxy to client: MTU is 1400

Now when the client sends a packet of size 1400 to the proxy, the proxy will forward it raw to the server, violating the maximum MTU of that interface. In tcpdump I observed the following message:

17:24:34.216571 IP x.x.x.x.dynamic.adsl.gvt.net.br.63328 > instance-1.us-east1-b.c.propane-choir-284403.internal.19132: UDP, bad length 1456 > 1448

It is a UDP stack error at the interface level: the interface received a datagram bigger than its maximum payload.

The key thing is that the suggested proxy.go implementation does not reassemble RakNet packets; instead, they are forwarded between the two sides raw. The fix could be to match the MTUs, or to make the proxy RakNet-aware for the payloads and not only the handshakes, but that would be impossible because the connection after the handshake is encrypted end to end.

Sandertv commented 4 years ago

The thing is that the proxy doesn't forward anything in 'raw' mode. It should all be fragmented where appropriate.

flplv commented 4 years ago

Interesting, let me do some more analysis here.

Sandertv commented 4 years ago

So if there is actually an issue where packets are sent that exceed the MTU size, I think that might be a bug of the actual fragmentation code.

flplv commented 4 years ago

Ok, I thought encryption was at raknet layer, but it seems it is at Minecraft layer.

I could not narrow down the cause of the bug yet, but this is the test I did:

I tested my use case in two ways: with this project as the server, and with Mojang's dedicated Minecraft Bedrock server, on the same cloud machine, on the same interface and port, at different moments. I observed a difference in the OpenConnectionReply1 packet sent from server to client.

Mojang's implementation reduced the MTU, while this implementation did not.

Please find attached the two tcpdumps. captures.zip

Sandertv commented 4 years ago

I think the MTU size being 1400 is a different thing altogether. Mojang for some reason caps the MTU size at 1400 server side.

flplv commented 4 years ago

Got it. Still, it is an important data point that the problem went away once my fix capped the MTU down to 1400. Somehow the root cause must be related to MTU size... I am not sure how much more time I can put into this debugging; I will let you know if I find anything.

Sandertv commented 4 years ago

> 17:24:34.216571 IP x.x.x.x.dynamic.adsl.gvt.net.br.63328 > instance-1.us-east1-b.c.propane-choir-284403.internal.19132: UDP, bad length 1456 > 1448

This error is worrying though; it might be pointing at some bug in the fragmenting. I'll have a look at that. Do you know what MTU size was established for the connection this occurred on?

flplv commented 4 years ago

Let me double check.

flplv commented 4 years ago

Hmm... scrolling down my tcpdump capture, look what I found:

[Screenshot from 2020-07-30 17-46-52]

The server is flooding the client with packets of size 1474, while the negotiated MTU was 1492.

[Screenshot from 2020-07-30 17-48-24]

Those packets are definitely too big for the interface's MTU. Either my interface is reporting a bad MTU to the application, or the application is not checking the interface's MTU correctly.

Sandertv commented 4 years ago

Perhaps limiting it to 1400 like Mojang does isn't a bad idea. That way the packets probably won't be fragmented like this anymore and VPNs will work properly...

flplv commented 4 years ago

Indeed.

flplv commented 4 years ago

this is my interface MTU:

ens4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1460

Sandertv commented 4 years ago

Thanks a lot for all the debugging. I'm going to dive into this tonight/tomorrow. I'll keep you up to date.

flplv commented 4 years ago

Isn't `pk.MTUSize = int16(buf.Len()+1) + 28` computing MTUSize over a packet that was already fragmented and reassembled by the IP layer? Because the OpenConnectionRequest1 packet received by the server was 1490 bytes, while the interface MTU was 1460, and Wireshark flagged it as a fragmented IP packet too.

flplv commented 4 years ago

It doesn't make much sense, but this is what I got on my server regarding the MTU computation:

`int16(buf.Len()+1) + 28` returns 1492, while the packet size in Wireshark is 1490, with 1456 bytes of payload.

These numbers make no sense. buf.Len() might not reflect the MTU at all...

Sandertv commented 4 years ago

The OpenConnectionRequest1 is padded in order to find the maximum size of a packet that isn't dropped, but it doesn't provide any way to determine whether the packet was fragmented at the IP protocol level. The MTU size in that packet therefore isn't exactly right: it is only the biggest packet size that was not dropped. I'm thinking of adding a 'PreferredMTUSize' field and establishing an MTU size based on that.

Sandertv commented 4 years ago

Either that or I might just make the max 1400 like Mojang does.

flplv commented 4 years ago

Ah, got it, so the implementation of the RakNet handshake expects packets to be discarded when they exceed the MTU of the path to the destination. That is not reliable:

       discovery.  This means the kernel will keep track of the MTU to a
       specific target IP address and return EMSGSIZE when a UDP packet
       write exceeds it.  When this happens, the application should decrease
       the packet size.  Path MTU discovery can be also turned off using the
       IP_MTU_DISCOVER socket option or the
       /proc/sys/net/ipv4/ip_no_pmtu_disc file; see ip(7) for details.  When
       turned off, UDP will fragment outgoing UDP packets that exceed the
       interface MTU.  However, disabling it is not recommended for
       performance and reliability reasons.

https://man7.org/linux/man-pages/man7/udp.7.html

In my system, path MTU discovery is disabled, which means IP fragmentation will occur.

Detecting the MTU reliably and portably is probably impossible; I'd recommend a hard limit on the MTU, as you suggested.

Sandertv commented 4 years ago

Could you check out a2bcbfd65374a507fa72d685baba2bca56c3b7bc and see if it still happens? I've capped the MTU size at 1400 and packet sizes seem to be equal to those sent by the vanilla server.

flplv commented 4 years ago

Tested, and the patch fixed the issue.

Sandertv commented 4 years ago

Okay excellent. I will close this issue and tag a new version.

flplv commented 4 years ago

Thank you my friend

ryanMushy commented 3 years ago

Hey gang, I'm a year late to finding this thread but...

I also ran into MTU max packet size issues while working with RakNet. We were sending large HD images via UDP and getting severe dropouts. Wireshark showed no extra packet splitting beyond the initial fragmentation to make all packets 1500 bytes. However, I also noticed that my packet sizes were not the correct length.

Long story short: when a packet is split, the reliability of the packet changes, and that affects the packet size. Here was my fix: https://github.com/facebookarchive/RakNet/pull/145