Add note about bytes serialization performance

dunnock commented 4 years ago

It appears that the performance benefit of using a custom serializer/deserializer is not obvious. I had an issue that was discussed on Reddit, and a few people suggested a solution, but most were quite surprised that serialization/deserialization of a byte array can be so slow compared to what you get with the proper module. Perf test: https://github.com/dunnock/ipc-bench

I tried to highlight this in the appropriate section of the docs.

insanitybit commented 4 years ago

This is great to call out, but it's not clear to me why it's faster.

Consider this part of regex's docs: https://github.com/rust-lang/regex#usage-avoid-compiling-the-same-regex-in-a-loop

Tells me what the problem is (recompiling is slow)
Explains why it's slow (it takes time, and it also prohibits other optimizations)
Shows the solution and explains why it solves the problem

This example shows me how to solve the problem, but I am left wondering what the problem is, or when to apply the solution.

A great first step, for sure.

dunnock commented 4 years ago

@insanitybit My guess is that a single memory copy is faster than iterating over the elements one by one. It's quite easy to reproduce, and the speed difference is orders of magnitude; see https://github.com/dunnock/ipc-bench/blob/master/benches/bincode.rs

For a simple struct (shown here with the serde_bytes attribute applied):

#[derive(Serialize, Deserialize, Clone)]
struct Message {
    pub topic: u32,
    #[serde(with = "serde_bytes")]
    pub data: Vec<u8>,
}

Without serde_bytes:

bincode_encode/10240    time:   [87.656 us 89.375 us 91.548 us]                                 
                        thrpt:  [106.67 MiB/s 109.27 MiB/s 111.41 MiB/s]
bincode_decode/10240    time:   [29.083 us 31.390 us 34.096 us]                                  
                        thrpt:  [286.42 MiB/s 311.10 MiB/s 335.78 MiB/s]

With serde_bytes:

bincode_encode/10240    time:   [267.12 ns 275.24 ns 284.70 ns]
                        thrpt:  [33.498 GiB/s 34.649 GiB/s 35.702 GiB/s]
bincode_decode/10240    time:   [298.75 ns 311.51 ns 328.28 ns]
                        thrpt:  [29.051 GiB/s 30.614 GiB/s 31.922 GiB/s]

I can also open an issue and a PR with a test in serde if needed (though I'm not sure that is the right place). Let me know if I can help further.
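
To make the "memory copy vs. iterating over elements" guess above concrete, here is a minimal std-only sketch of the two code paths. It does not use serde or bincode: the Sink trait, VecSink, the function names, and the 8-byte little-endian length prefix are all hypothetical stand-ins chosen for illustration. The idea it models is that a plain Vec<u8> field is handed to the serializer as a generic sequence, one element at a time, while serde_bytes hands the whole slice over in one call. Both paths produce the same bytes; only the number of calls differs.

```rust
// Hypothetical stand-in for a serializer's output side; NOT serde's real API.
trait Sink {
    fn write_u8(&mut self, byte: u8);
    fn write_bytes(&mut self, bytes: &[u8]);
}

// Simple sink that appends everything to an in-memory buffer.
struct VecSink(Vec<u8>);

impl Sink for VecSink {
    fn write_u8(&mut self, byte: u8) {
        self.0.push(byte);
    }

    fn write_bytes(&mut self, bytes: &[u8]) {
        // One bulk copy of the whole slice.
        self.0.extend_from_slice(bytes);
    }
}

// Element-wise path: a length prefix (assumed framing), then one call per byte.
fn encode_as_seq(data: &[u8], out: &mut dyn Sink) {
    out.write_bytes(&(data.len() as u64).to_le_bytes());
    for &byte in data {
        out.write_u8(byte);
    }
}

// Bulk path: the same length prefix, then a single call for the whole slice.
fn encode_as_bytes(data: &[u8], out: &mut dyn Sink) {
    out.write_bytes(&(data.len() as u64).to_le_bytes());
    out.write_bytes(data);
}

// Runs both paths on the same input and returns (element-wise, bulk) output.
fn encode_both(data: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let mut seq = VecSink(Vec::new());
    let mut bulk = VecSink(Vec::new());
    encode_as_seq(data, &mut seq);
    encode_as_bytes(data, &mut bulk);
    (seq.0, bulk.0)
}

fn main() {
    // Same payload size as the 10240-byte benchmark case above.
    let data = vec![7u8; 10240];
    let (seq, bulk) = encode_both(&data);
    assert_eq!(seq, bulk);
    println!("both paths produced {} bytes", seq.len());
}
```

The sketch only shows that the outputs match while the work per byte differs; it makes no claim about reproducing the timings above, which come from the linked benchmark.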

mrobakowski commented 4 years ago

Wow. This needs to be in the docs. Using serde_bytes totally changed the experimental results in my master's thesis. I just assumed that if I use bincode and have a byte array inside a struct, serializing it would basically be a memcpy.

serde-rs / serde-rs.github.io

Add note about bytes serialization performance #105