SpigotMC / BungeeCord

BungeeCord, the 6th in a generation of server portal suites. Efficiently proxies and maintains connections and transport between multiple Minecraft servers.
https://www.spigotmc.org/go/bungeecord
Other
1.56k stars 1.1k forks source link

Packet sending optimization #3392

Open andreasdc opened 2 years ago

andreasdc commented 2 years ago

Feature description

I run Spark and I noticed most of the cpu is being used on sending packets, maybe there are some ways to optimize it, for example like Paper did with their Spigot fork? Something like: https://github.com/PaperMC/Paper/blob/master/patches/server/0732-Allow-controlled-flushing-for-network-manager.patch https://github.com/PaperMC/Paper/blob/master/patches/server/0748-Consolidate-flush-calls-for-entity-tracker-packets.patch https://github.com/PaperMC/Paper/blob/master/patches/server/0757-Optimise-non-flush-packet-sending.patch https://github.com/PaperMC/Paper/blob/master/patches/server/0186-Disable-Explicit-Network-Manager-Flushing.patch It of course depends on what packets are being send and in what way. image

Goal of the feature

Optimized packet sending.

Unfitting alternatives

-

Checking

Outfluencer commented 2 years ago

I think that this is not feasible. BungeeCord directly forwards every single packet, also there is no tick frequency in wich bungeecord could flush unflushed packets. I thought about it and every possible way i found would end with delayed packet flushing. That would increase the ping. In spigot you could queue all packets to be flushed at the next tick or at the end of the current, only issue would be the keep alive and chat packet. we would need to flush them directly as chat is working async without server ticking and keep alive must be exactly as possible or the time in the keep alive need to be rewritet to the time the packet is flushed so we would have the exact time

Outfluencer commented 2 years ago

A possible way to do this with minimal latency increase would be to queue a flush 1ms after a packet is written. if this already is queued another packet would not queue the flush again it would just be flushed with the first written packet that queued the flush.

After the flush happend the next not flushed packet would queue the next flush in 1 ms so the first packet would have a 1ms delay and the packets that are witten after the first would have a delay less than 1ms

Janmm14 commented 2 years ago

I think that this is not feasible. BungeeCord directly forwards every single packet, also there is no tick frequency in wich bungeecord could flush unflushed packets. I thought about it and every possible way i found would end with delayed packet flushing. That would increase the ping. In spigot you could queue all packets to be flushed at the next tick or at the end of the current, only issue would be the keep alive and chat packet. we would need to flush them directly as chat is working async without server ticking and keep alive must be exactly as possible or the time in the keep alive need to be rewritet to the time the packet is flushed so we would have the exact time

You can easily also delay KeepAlive packets, else it'd not show the real delay of the network. The integer in KeepAlive packets does not need to be rewritten. The client has to match what the server sent, if a server switch happens maybe the new server gets an old keepalive packet answer from the previous server; but that problems exists already.

Yes, it'd make sense to still immediately flush chat packets, that difference can be made easily tho.

Maybe something similar to a FlushConsolidationHandler is a better approach?

Outfluencer commented 2 years ago

Yeah we could just add an FlushConsolidationHandler to the end of the Base ChannelInitializer like

ch.pipeline().addFirst(FLUSH_HANDLER, new FlushConsolidationHandler());

but i am not sure what the value explicitFlushAfterFlushes and consolidateWhenNoReadInProgress should be for BungeeCord

Janmm14 commented 2 years ago

The first one would be that we let a flush "through" every x flushes (aka packets) even while we still did not read everything the tcp in has to offer. In general I'd say this needs some testing (CPU usage) and might even depend on the type of server.

The second parameter seems kinda well described in the javadoc. I'd set it to false as paper already does the heavy lifting with timed flushes, but for spigot it might be better if its set to true, that'd needs testing.

Outfluencer commented 2 years ago

I dont understand what happens if the server only sends 1 packet how does it get flushed than if x = 20 and we only flushed one packet

Janmm14 commented 2 years ago

What I understand is that the handler will only delay when there is more stuff waiting to be read.

Thats also a reason I said "something like" said handler, as we need to get the readable state from the OTHER ctx (proxy - client needs state from proxy-server and other way round)

Outfluencer commented 2 years ago

I don’t understand how it can know if there is more to read in the code is nothing that looks like this maybe I just don’t understand but I thought that you can‘t see what will be read next

andreasdc commented 2 years ago

Does bungee need to forward every packet? For example there is huge spam with various packets, even set slot, block changes or inventory packets.

Janmm14 commented 2 years ago

Does bungee need to forward every packet? For example there is huge spam with various packets, even set slot, block changes or inventory packets.

Sorting out individual redundant packets is far more work to do, consumes a bunch of ram and introduces much version-dependant code. Redundant stuff should be sorted out on spigot side.

andreasdc commented 2 years ago

Does bungee need to forward every packet? For example there is huge spam with various packets, even set slot, block changes or inventory packets.

Sorting out individual redundant packets is far more work to do, consumes a bunch of ram and introduces mich version-dependant code. Redundant stuff should be sorted out on spigot side.

The thing is bungee sends packets to players, so you need to send packet twice: on spigot and bungee.

Janmm14 commented 2 years ago

Does bungee need to forward every packet? For example there is huge spam with various packets, even set slot, block changes or inventory packets.

Sorting out individual redundant packets is far more work to do, consumes a bunch of ram and introduces mich version-dependant code. Redundant stuff should be sorted out on spigot side.

The thing is bungee sends packets to players, so you need to send packet twice: on spigot and bungee.

Its just not useful.

andreasdc commented 2 years ago

What do you mean?

Janmm14 commented 2 years ago

What do you mean?

The work which would be needed to not just forward some bytes, but to interpret said bytes, check with previous captured data, updating stuff like whats in which inventory slut currently to be able to not send this couple bytes if for example the stuff in an inventory slot did not change, is far more taxing on ram and cpu than to just forward some bytes. Especially when we introduce something similar to a FlushConsolidationHandler which will reduce the amount of flushes.

@Outfluencer wrote: I don’t understand how it can know if there is more to read in the code is nothing that looks like this maybe I just don’t understand but I thought that you can‘t see what will be read next

The handler uses channelReadComplete, which fires when there is no more data in the underlying tcp buffer to read: https://stackoverflow.com/questions/28473252/how-does-netty-determine-when-a-read-is-complete

andreasdc commented 2 years ago

But bungee sends every packet that the spigot is sending to the player, so you make 2 heavy operations twice, if I'm not wrong. Even if channelReadComplete checks for data in the buffer it happens so often that there is a lot of flushes. Is it possible to send packets directly to the player instead of doing it through bungeecord? Also every ChannelWrapper#write users writeAndFlush method https://github.com/SpigotMC/BungeeCord/blob/8d783aa172830f212baa197b6b0c1c3d5f1237d1/proxy/src/main/java/net/md_5/bungee/netty/ChannelWrapper.java#L50 image

Outfluencer commented 2 years ago

We are sending it directly to the playe. as I understand is the flush handler the best solution and should work fine

andreasdc commented 2 years ago

We are sending it directly to the playe. as I understand is the flush handler the best solution and should work fine

Bungee sends packets like block change, set slot etc.

Outfluencer commented 2 years ago

Bungee forwards the packets

Outfluencer commented 2 years ago

It reads it from the backend and writes it to the client

Outfluencer commented 2 years ago

Also it reads from the client and writes to the backend

andreasdc commented 2 years ago

Yes, so you send packets twice, these operations are very expensive.

Outfluencer commented 2 years ago

Yes that’s the basic way a proxy works

andreasdc commented 2 years ago

That's not really good idea when maintaining high performance server

Outfluencer commented 2 years ago

Feel free to use just spigot if you don’t want an proxy to waste performance

andreasdc commented 2 years ago

Yea, but you can't manage multiple servers, redirect players etc.

Outfluencer commented 2 years ago

Yeah to redirect players you need a proxy and a proxy need to read an write all packets for client and server

andreasdc commented 2 years ago

Is there any way to avoid that?

Outfluencer commented 2 years ago

No

andreasdc commented 2 years ago

What if you create virtual server under the bungeecord? You have the server and packets directly for the player, right?

Janmm14 commented 2 years ago

The heavy operation is not the amount of packets, it is the flushing as that leads to a syscall.

Many packets like single block change, entity movement or set slot are small, so if we reduce the flushing by waiting for more packets sent together with those, load can be reduced. And likely there will be a couple movements every tick along other packets, so there will likely always be potential for reduced cpu usage.

When bungee gets 10 packets in one batch for example, there will be 10 flushes currently. If we add sth like the flushconsolidationhandler, it will be 1 flush instead of 10. And that one flush will not take 10x the amount of time, but much less.

I think I will create a pr for this soon, but I have no ability to test its impact.

caoli5288 commented 2 years ago

is waiting for io_uring to save the world.

andreasdc commented 2 years ago

Writing is expensive too, the most important it will be to test the delay that the flushconsolidationhandler, I asked Paper, but they said there will be delay and they don't care about as it can be disabled. Flushing when spigot is flushing will be good.

is waiting for io_uring to save the world.

What's that?

Janmm14 commented 2 years ago

io_uring is a new Linux kernel api aiming to reduce the need for syscalls. Theres a test implementation for that in netty, but only time will tell when its good for production. Not sure whether io_uring is even final in the linux kernel.

Janmm14 commented 1 year ago

3396 might improve speed of this PR a little

andreasdc commented 1 year ago

3396 might improve speed of this PR a little

Oh yea I have that, it might bring some boost.

Janmm14 commented 1 year ago

Its more like 3396 prevents a speed improvements being lower than theoretically possible when using my PR (as only my PR uses ChannelDuplexHandlers, no existing bungee code). In terms of speed that PR does not change much in bungeecord as-is. Shouldve maybe posted this under my pr but too late now.

xism4 commented 1 year ago

Seems 4.1.85 was released, i updated it again.