serde-rs / serde-rs.github.io

https://serde.rs
Creative Commons Attribution Share Alike 4.0 International

Add note about bytes serialization performance #105

Open dunnock opened 4 years ago

dunnock commented 4 years ago

It appears that the performance benefit of using a custom serializer/deserializer is not obvious. I had an issue that was discussed on reddit, and a few people suggested a solution, but most were quite surprised that serialization/deserialization of a byte array can be so slow compared to what you get when using the proper module. Perf test: https://github.com/dunnock/ipc-bench

I tried to highlight that in the relevant section of the docs.

insanitybit commented 4 years ago

This is great to call out, but it's not clear to me why it's faster.

Consider this part of regex's docs: https://github.com/rust-lang/regex#usage-avoid-compiling-the-same-regex-in-a-loop

This example shows me how to solve the problem, but I am left wondering what the problem is, or when to apply the solution.

A great first step, for sure.

dunnock commented 4 years ago

@insanitybit I guess a memory copy is faster than iterating over the elements. It's quite easy to reproduce, and the speed difference is orders of magnitude; see https://github.com/dunnock/ipc-bench/blob/master/benches/bincode.rs

For a simple struct:

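use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Clone)]
struct Message {
    pub topic: u32,
    // Route the byte vector through the serializer's bytes path.
    #[serde(with = "serde_bytes")]
    pub data: Vec<u8>,
}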

Without serde_bytes:

bincode_encode/10240    time:   [87.656 us 89.375 us 91.548 us]                                 
                        thrpt:  [106.67 MiB/s 109.27 MiB/s 111.41 MiB/s]
bincode_decode/10240    time:   [29.083 us 31.390 us 34.096 us]                                  
                        thrpt:  [286.42 MiB/s 311.10 MiB/s 335.78 MiB/s]

With serde_bytes:

bincode_encode/10240    time:   [267.12 ns 275.24 ns 284.70 ns]
                        thrpt:  [33.498 GiB/s 34.649 GiB/s 35.702 GiB/s]
bincode_decode/10240    time:   [298.75 ns 311.51 ns 328.28 ns]
                        thrpt:  [29.051 GiB/s 30.614 GiB/s 31.922 GiB/s]
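
One way to picture the gap in the numbers above: a plain Vec<u8> goes through serde as a sequence, so the serializer sees one call per element, while serde_bytes hands the whole slice over in a single serialize_bytes call that can be a bulk copy. The sketch below is only a simplified model of those two paths using the standard library. `ByteSink`, `CountingBuf`, `encode_as_seq` and `encode_as_bytes` are hypothetical names made up for illustration, not serde's or bincode's actual code, and it says nothing about how much of the measured difference comes from call overhead versus buffer growth or bounds checks.

```rust
// Simplified model (NOT serde's or bincode's real implementation) of the
// two paths a byte vector can take through a serializer.

/// Hypothetical stand-in for the two `Serializer` methods that matter here.
trait ByteSink {
    fn serialize_u8(&mut self, v: u8);
    fn serialize_bytes(&mut self, v: &[u8]);
}

/// Hypothetical output buffer that also counts how often it was called.
#[derive(Default)]
struct CountingBuf {
    out: Vec<u8>,
    calls: usize,
}

impl ByteSink for CountingBuf {
    fn serialize_u8(&mut self, v: u8) {
        self.calls += 1;
        self.out.push(v); // one byte appended per call
    }

    fn serialize_bytes(&mut self, v: &[u8]) {
        self.calls += 1;
        self.out.extend_from_slice(v); // whole slice appended in one bulk copy
    }
}

/// Models the default path for `Vec<u8>`: a sequence, one call per element.
fn encode_as_seq<S: ByteSink>(sink: &mut S, data: &[u8]) {
    for &b in data {
        sink.serialize_u8(b);
    }
}

/// Models the `serde_bytes` path: the slice is passed through in one call.
fn encode_as_bytes<S: ByteSink>(sink: &mut S, data: &[u8]) {
    sink.serialize_bytes(data);
}
```

For a 10240-byte payload, both functions produce the same output bytes in this model, but the sequence path makes 10240 sink calls where the bytes path makes one. Whether a given format benefits depends on how its serializer implements the bytes method, so measuring, as in the benchmark above, is still the reliable check.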

I can also open an issue and a PR with a test in serde if needed (though I'm not sure that is the right place). Let me know if I can help further.

mrobakowski commented 4 years ago

Wow. This needs to be in the docs. Using serde_bytes totally changed the experimental results in my master's thesis. I just assumed that if I use bincode and have a byte array inside a struct, serializing it would basically be a memcpy.