lemunozm / message-io

Fast and easy-to-use event-driven network library.
Apache License 2.0

Encoding with less bytes. #60

Closed lemunozm closed 3 years ago

lemunozm commented 3 years ago

Currently, the FramedTcp adapter achieves its goal of transforming TCP from stream-based to packet-based by prepending a 4-byte header that holds the size of each packet. This is done through the encoding module (which could serve other adapters, but is currently only used by FramedTcp).

In most cases, 4 bytes is more than necessary (most messages need only 1 or 2 bytes for their length). Fortunately, bincode has an option for variable-length integer (varint) encoding. This should be relatively easy to implement.

EDIT: Another nice library that provides this functionality without bincode: integer_encoding
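To make the idea concrete, here is a minimal sketch of one common variable-length integer scheme (unsigned LEB128: 7 payload bits per byte, with the high bit marking "more bytes follow"). The function names are hypothetical, and the byte layout is an assumption for illustration; it is not necessarily what bincode emits in its own varint mode.

```rust
/// Appends `value` to `out` as an unsigned LEB128 varint:
/// 7 payload bits per byte, high bit set on every byte except the last.
fn encode_varint(mut value: u32, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Reads a varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if
/// `buf` ends too early or the value does not fit in a `u32`.
fn decode_varint(buf: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in buf.iter().enumerate().take(5) {
        let bits = (byte & 0x7F) as u32;
        // The fifth byte may only carry the top 4 bits of a u32.
        if i == 4 && bits > 0x0F {
            return None;
        }
        value |= bits << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

With this layout, lengths up to 127 take 1 byte, lengths up to 16383 take 2 bytes, and the full `u32` range takes at most 5 bytes.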

Uriopass commented 3 years ago

Just wondering why you linked bincode, as it is not a dependency of message-io? Do you mean that the code from bincode could be re-used (adapted) to the project?

lemunozm commented 3 years ago

Hi @Uriopass,

The idea is to add this dependency and use its variable-length encoding for integers.

Currently, the encoding module of message-io uses to/from_le_bytes to "encode" the packet size in the output buffer. Instead of these functions, it could call bincode::serialize::<u32>() with bincode's variable-length integer mode enabled (an option the library offers). The encoding module would then rely on the bincode dependency to do its job.
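For reference, the fixed-width framing described above can be sketched as follows. This is only an illustration of a 4-byte little-endian length prefix built with `to_le_bytes` / `from_le_bytes`; the names `HEADER_SIZE`, `encode_frame`, and `decode_frame` are hypothetical and are not taken from message-io's encoding module.

```rust
/// Size of the fixed length prefix, in bytes.
const HEADER_SIZE: usize = 4;

/// Writes `payload` to `out`, preceded by its length as a
/// 4-byte little-endian integer.
fn encode_frame(payload: &[u8], out: &mut Vec<u8>) {
    // Assumes the payload length fits in a u32.
    let len = payload.len() as u32;
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
}

/// Tries to read one frame from the front of `buf`.
/// Returns the payload and the total number of bytes consumed,
/// or `None` if the buffer does not yet hold a complete frame.
fn decode_frame(buf: &[u8]) -> Option<(&[u8], usize)> {
    if buf.len() < HEADER_SIZE {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let end = HEADER_SIZE.checked_add(len)?;
    let payload = buf.get(HEADER_SIZE..end)?;
    Some((payload, end))
}
```

Every frame pays the same 4 bytes of overhead here, whether the payload is 5 bytes or 5 megabytes, which is the cost this issue wants to reduce.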

Uriopass commented 3 years ago

I think bincode is quite a heavy dependency for such a simple thing. I think a basic implementation (I tried writing one but it was not very pretty, might try again later) could work like this: if first byte is 255, then the 4 (or 3?) next bytes are the message length. if first byte is <255, then this is the length of the message.
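A minimal sketch of the marker-byte scheme described above, under some assumptions: it takes the 4-byte variant of the "4 (or 3?)" question, stores the long length in little-endian order, and uses hypothetical function names.

```rust
/// Encodes `len` with a one-byte fast path:
/// values below 255 are stored directly in a single byte,
/// anything else is stored as the marker 255 followed by 4 bytes.
fn encode_len_marker(len: u32, out: &mut Vec<u8>) {
    if len < 255 {
        out.push(len as u8);
    } else {
        out.push(255);
        // Little-endian is an assumption; the comment does not specify it.
        out.extend_from_slice(&len.to_le_bytes());
    }
}

/// Decodes a length written by `encode_len_marker`.
/// Returns the length and the number of header bytes consumed,
/// or `None` if `buf` is too short.
fn decode_len_marker(buf: &[u8]) -> Option<(u32, usize)> {
    match *buf.first()? {
        255 => {
            let b = buf.get(1..5)?;
            Some((u32::from_le_bytes([b[0], b[1], b[2], b[3]]), 5))
        }
        short => Some((short as u32, 1)),
    }
}
```

Note that a length of exactly 255 must take the long path, because the value 255 in the first byte is reserved as the marker.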

You could also use something like the UTF-8 varlength encoding scheme using the first n 1s to know how many bytes are to come.
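The UTF-8-like idea could look roughly like the sketch below. It is one possible interpretation, not real UTF-8: the number of leading 1 bits in the first byte says how many extra bytes follow, the remaining bits of the first byte hold the most significant part of the value, and the extra bytes follow in big-endian order. The function names and the exact bit layout are assumptions.

```rust
/// Encodes `len` so that the count of leading 1 bits in the first byte
/// equals the number of extra bytes that follow (0 to 4).
fn encode_len_prefixed(len: u32, out: &mut Vec<u8>) {
    let extra: u32 = match len {
        0..=0x7F => 0,                 // 0xxxxxxx            (7 bits)
        0x80..=0x3FFF => 1,            // 10xxxxxx + 1 byte   (14 bits)
        0x4000..=0x1F_FFFF => 2,       // 110xxxxx + 2 bytes  (21 bits)
        0x20_0000..=0x0FFF_FFFF => 3,  // 1110xxxx + 3 bytes  (28 bits)
        _ => 4,                        // 11110xxx + 4 bytes  (35 bits)
    };
    let value = len as u64;
    // `extra` one bits followed by a zero bit, e.g. 0b1100_0000 for extra = 2.
    let prefix = !(0xFFu8 >> extra);
    out.push(prefix | (value >> (8 * extra)) as u8);
    for i in (0..extra).rev() {
        out.push((value >> (8 * i)) as u8);
    }
}

/// Decodes a length written by `encode_len_prefixed`.
/// Returns the length and the number of header bytes consumed.
fn decode_len_prefixed(buf: &[u8]) -> Option<(u32, usize)> {
    let first = *buf.first()?;
    let extra = first.leading_ones() as usize;
    if extra > 4 {
        return None;
    }
    let rest = buf.get(1..1 + extra)?;
    // Keep only the payload bits that come after the prefix and its zero bit.
    let mut value = (first & (0xFF >> (extra + 1))) as u64;
    for &b in rest {
        value = (value << 8) | b as u64;
    }
    if value > u32::MAX as u64 {
        return None;
    }
    Some((value as u32, 1 + extra))
}
```

Compared with a continuation-bit varint, this layout lets the reader know the full header size after inspecting only the first byte.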

lemunozm commented 3 years ago

Thanks for the hint!

Of course, any other lighter library to reach the target is also valid.

I just did a quick search and found a library that does precisely this, without all the extra weight of bincode. It could be a good option: https://docs.rs/integer-encoding/3.0.2/integer_encoding/trait.VarInt.html

Regarding this:

if first byte is 255, then the 4 (or 3?) next bytes are the message length

I think a lot of messages could be longer than 255 bytes, and they would be forced to use 4 bytes for the length when 2 bytes would be enough in most cases.
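The trade-off can be checked with a small calculation of header sizes. The sketch below assumes the 4-byte variant of the marker scheme (so 5 header bytes in total once the marker byte is counted) and a 7-bits-per-byte varint; both helper names are hypothetical.

```rust
/// Header size of the marker scheme: 1 byte below 255,
/// otherwise the marker byte plus a 4-byte length.
fn marker_header_len(len: u32) -> usize {
    if len < 255 { 1 } else { 5 }
}

/// Header size of a 7-bits-per-byte varint.
fn varint_header_len(len: u32) -> usize {
    let mut bytes = 1;
    let mut rest = len;
    while rest >= 0x80 {
        rest >>= 7;
        bytes += 1;
    }
    bytes
}
```

For a 300-byte message, `marker_header_len(300)` gives 5 while `varint_header_len(300)` gives 2, and the varint header stays at 2 bytes for lengths up to 16383.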