space-wizards / RobustToolbox

Robust multiplayer game engine, used by Space Station 14
https://spacestation14.io

Network Level Gamestate Compression #889

Closed PJB3005 closed 3 years ago

PJB3005 commented 4 years ago

We currently use a ton of bandwidth to do netcode.

NetSerializer is not exactly bandwidth efficient as far as I can tell. It also does not provide any way to do delta updates efficiently (for example, in a component state, using bitflags and stuff to only send the new position).

It would be best if, instead of simply running the game state delta through NetSerializer, we used Lidgren's NetOutgoingMessage/NetIncomingMessage API to reduce the state delta size.
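The bitflag delta idea above (only send fields that changed, with a flags field saying which ones follow) can be sketched like this. This is a language-neutral Python illustration, not engine code; the field names and the single-byte flags layout are hypothetical:

```python
import struct

# Hypothetical field layout for a transform-like component state.
# Each bit in a leading flags byte marks whether that field follows.
POS_X, POS_Y, ROTATION = 0x1, 0x2, 0x4
FIELDS = ((POS_X, "x"), (POS_Y, "y"), (ROTATION, "rot"))

def encode_delta(old, new):
    """Pack only the fields that changed, preceded by a flags byte."""
    flags = 0
    payload = b""
    for flag, key in FIELDS:
        if old[key] != new[key]:
            flags |= flag
            payload += struct.pack("<f", new[key])
    return bytes([flags]) + payload

def decode_delta(old, data):
    """Apply a delta produced by encode_delta on top of the old state."""
    flags, offset = data[0], 1
    new = dict(old)
    for flag, key in FIELDS:
        if flags & flag:
            (new[key],) = struct.unpack_from("<f", data, offset)
            offset += 4
    return new
```

An unchanged state costs one byte here instead of three floats; a single moved axis costs five bytes.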

Acruid commented 4 years ago

Looking closer at lidgren's API, you can directly modify the backing array behind the bit reader/writer. This allows us to do things like compress the entire packet payload after crafting the packet with the Read/Write API. The Reader/Writer itself is good enough (I think); there's no need to create our own BitStream class.

Another major optimization would be a string table, so that we can intern prototype names, and only have to send the string ID over the network. This can be implemented independent of anything else, any time.
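The string table amounts to a dictionary shared by both ends, so only a small integer crosses the wire. A minimal Python sketch, assuming both sides intern names in the same order (e.g. from the prototype list); the name used in the example is hypothetical:

```python
class StringTable:
    """Interns strings to small integer IDs. Server and client must
    build the table in the same order for the IDs to agree."""

    def __init__(self):
        self._by_name = {}
        self._by_id = []

    def intern(self, name):
        """Return the ID for name, assigning a new one if unseen."""
        if name not in self._by_name:
            self._by_name[name] = len(self._by_id)
            self._by_id.append(name)
        return self._by_name[name]

    def lookup(self, string_id):
        """Recover the original string from its network ID."""
        return self._by_id[string_id]
```

A prototype name like `"ReinforcedWall"` then costs one or two bytes per reference instead of the full string.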

PJB3005 commented 4 years ago

Does running regular compression algorithms over a bitstream like that even help much? Ideally we would be using Lidgren's bit packing to the fullest extent.

Acruid commented 4 years ago

The delta compressed bit packed data is not a stream of fixed length words. The delta compression introduces single bits and the bit packed variables can use an arbitrary number of bits. As a result regular entropy encoding is difficult because entropy encoders typically assume a word length of a fixed number of bits.

"Regular" compression algorithms (Deflate) won't work well here.

> The delta compression writes out a single zero bit for any variable that did not change. Because only few variables change over time the delta compression introduces a lot of sequences of zero bits. For this reason 3-bit word length zero based run-length compression is used. The compressor reads 3 bits at a time and if the bits are not all zero the same bits are written out without changes. If the 3 bits are all zero the compressor keeps reading words of 3 bits at a time until they are no longer all zero. The compressor then writes out 3 zero bits followed by 3 bits that represent the number of times 3 zero bits were read in succession. Using a 3-bit word length results in a maximum compression ratio of 4:1. A different number of bits than 3 could be used for the word length but for the DOOM III entities using 3 bits results in the best compression ratios in practice.

http://mrelusive.com/publications/papers/The-DOOM-III-Network-Architecture.pdf, Section 4.3

So the theoretical ratio is 4:1, they do not mention what their actual average ratio is. After delta compression is implemented, someone should manually analyze our packet payloads and see if there are large groups of zeros like they mention.
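The quoted 3-bit zero-run scheme can be sketched in a few lines of Python (over bit lists, for clarity). One assumption: the 3-bit count must store run length minus one, because that is the only reading under which 8 zero words (24 bits) shrink to 6 bits for the stated 4:1 maximum; the paper excerpt doesn't spell this out.

```python
def compress_zero_runs(bits):
    """3-bit-word zero-run-length coder. A run of n all-zero words
    (1 <= n <= 8) becomes 000 followed by (n - 1) in 3 bits."""
    assert len(bits) % 3 == 0
    words = [tuple(bits[i:i + 3]) for i in range(0, len(bits), 3)]
    out, i = [], 0
    while i < len(words):
        if words[i] == (0, 0, 0):
            n = 0
            while i < len(words) and words[i] == (0, 0, 0) and n < 8:
                n += 1
                i += 1
            out += [0, 0, 0]                       # run marker
            out += [(n - 1) >> 2 & 1, (n - 1) >> 1 & 1, (n - 1) & 1]
        else:
            out += list(words[i])                  # non-zero word: copy
            i += 1
    return out

def decompress_zero_runs(bits):
    """Inverse of compress_zero_runs; input must be well-formed."""
    words = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    out, i = [], 0
    while i < len(words):
        if words[i] == [0, 0, 0]:
            c = words[i + 1]
            n = (c[0] << 2 | c[1] << 1 | c[2]) + 1
            out += [0, 0, 0] * n
            i += 2
        else:
            out += words[i]
            i += 1
    return out
```

On a pure run of 24 zero bits this hits the 4:1 bound exactly; mixed data does worse, which is why measuring real payloads matters.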

moonheart08 commented 4 years ago

Related: #973

Tyler-IN commented 4 years ago

I am demonstrating 10:1 compression in tests using level 1 LZMA with a 4 MB buffer. That ratio is due to the poor packing of the current serialized data. This is faster, and provides better compression ratios at any setting, than the common fully cross-platform, ready-to-use libraries (like LZ4).

I am working on a serializer replacement that should provide smaller output, and I implemented 5:1 per-frame compression for big frames on the current NetSerializer implementation using LZMA. A shared string and type dictionary will also greatly reduce the size of serializer output (and lower the compression ratios, since the output will already be closer to maximum entropy).

I created an experiment using a TCP back channel that achieved > 10:1 compression; it relied on shared state to allow small change messages to "go around" large state updates. A shared-state back channel is not practical, or even reliably possible, without a reliable and ordered stream. If a stream is implemented over Lidgren, I can implement that as a UDP back-channel path.

I have written a correctly behaving "bubble" system for moving objects, and I'm currently working on handling teleporting objects. This greatly reduces network traffic and eliminates "objects thrown into deep space" network traffic. Once teleporting objects are correctly handled, the next step is to perform comparisons with the previous state as we're keeping it around. However, "delta" compression can't and won't work without either a reliable ordered state sequence or a large "unacknowledged history" state change buffer on the server.

As for enum values, constant strings and other "gotchas" with the current serializer, I intend to have perfect bit packing for them, as their sizes will be known by both ends of the serialization path. It is a fool's errand to attempt to use smaller-sized enums and integer fields. When used outside the serializer, writing enum values as variable-sized ints is sufficiently compact for all of our enums, mostly consuming 1 byte.
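The variable-sized int encoding mentioned above is commonly done as unsigned LEB128 (7 value bits per byte, high bit as a continuation flag), which is one plausible reading of it; a minimal Python sketch:

```python
def write_varint(value):
    """Unsigned LEB128: 7 value bits per byte, high bit set on all
    but the last byte. Values under 128 fit in a single byte."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def read_varint(data, offset=0):
    """Decode one varint; returns (value, next_offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, offset
```

Since nearly every enum value in practice is small, this is the "mostly 1 byte" behavior described above.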

We can decorate fields with attributes to identify value ranges to provide high to perfect entropy compression in critical network data structures.
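The payoff of a declared value range is that the bit width follows directly from it. A tiny Python sketch of that calculation (the attribute mechanism itself would live in the serializer and is not shown; this helper is hypothetical):

```python
def bits_for_range(lo, hi):
    """Minimum bits needed to encode an integer in [lo, hi] exactly,
    as a range attribute on a field could declare."""
    span = hi - lo  # number of distinct values, minus one
    return max(1, span.bit_length())
```

So a field annotated with the range [0, 255] costs 8 bits, [0, 1] costs a single bit, and [-8, 7] costs 4, with no per-field overhead on the wire.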

Zumorica commented 3 years ago

Pretty much fixed by #1981