Open ajeetdsouza opened 2 years ago
Thanks for the suggestion and the kind words. I think that adding zero-copy benchmarks for serde-based crates is a good idea, though some work will have to go into making a set of structs that are compatible. The deserialize benchmark is meant to benchmark the cost of creating a completely owned structure, which is why this would have to go in a new benchmark. This is a bit unintuitive, so I'll add some clarification to the deserialize benchmark description.
The deserialize benchmark bans all zero-copy features so that a fair comparison can be made between the zero-copy frameworks (e.g. rkyv, abomonation) and the more traditional serialization frameworks (e.g. bincode, serde-json). This is so that users who plan on completely deserializing their objects have a benchmark for all of the available options.
Two questions that I have are:
How much ZCD should serde-like libraries do? Just strings?
As much as possible. In bincode's case, I don't think they have anything other than strings (`str`/`OsStr`/`Path`/`[u8]`). For your benchmarks, I think the natural way would be to add another entry for `bincode (borrowed)` in the first table itself, with an asterisk explaining the situation. Bincode doesn't really fit in the ZCD table, because it's generating a native Rust struct, so there's no concept of access / update here.
Should the deserialize benchmark be measured by deserializing directly into an owned structure, or by deserializing into a borrowed structure and then cloning?
If there's a substantial difference, I think I'd want to see both. It would help people writing code in performance-sensitive contexts to know what to expect in each case.
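To make the two options concrete, here is a rough sketch of both paths. It does not use bincode or serde: a hand-rolled format (a little-endian `u32` length followed by UTF-8 bytes) stands in for a real one, and the function names `payload`, `decode_borrowed`, `decode_owned`, and `decode_then_clone` are made up for illustration.

```rust
/// Returns the string payload of a toy record laid out as
/// [u32 length, little-endian][UTF-8 bytes], or `None` if truncated.
fn payload<'a>(buf: &'a [u8]) -> Option<&'a [u8]> {
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(buf.get(..4)?);
    let len = u32::from_le_bytes(len_bytes) as usize;
    let end = 4usize.checked_add(len)?;
    buf.get(4..end)
}

/// Borrowed deserialization: validates UTF-8 and returns a slice of `buf`.
fn decode_borrowed<'a>(buf: &'a [u8]) -> Option<&'a str> {
    std::str::from_utf8(payload(buf)?).ok()
}

/// Path 1: deserialize directly into an owned `String`
/// (copy the bytes, then validate them in place).
fn decode_owned(buf: &[u8]) -> Option<String> {
    String::from_utf8(payload(buf)?.to_vec()).ok()
}

/// Path 2: deserialize into a borrowed value, then clone it
/// (validate the bytes in `buf`, then copy them).
fn decode_then_clone(buf: &[u8]) -> Option<String> {
    decode_borrowed(buf).map(str::to_owned)
}
```

In this toy case both paths do one validation and one copy, so any gap between them would come from how a given framework implements each path. That is what measuring both would show.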
I noticed that the structs you're using with serde-based deserializers actually have an owned `String` in them. However, bincode doesn't need to copy strings (I'm not sure about the others). I'm actually using this in zoxide right now. Here are my struct definitions:
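As a minimal sketch of a struct of this shape (the names `Dir`, `path`, and `rank` are hypothetical, not necessarily zoxide's real definitions), the example below holds its string as a `Cow<'a, str>`. To keep it free of external crates, a hand-rolled decoder for a toy little-endian layout stands in for bincode. With serde itself, the struct would derive `Deserialize`, and as far as I know a `Cow<'a, str>` field also needs `#[serde(borrow)]` to come back as `Cow::Borrowed` rather than `Cow::Owned`.

```rust
use std::borrow::Cow;

/// Hypothetical record; the field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Dir<'a> {
    path: Cow<'a, str>,
    rank: f64,
}

/// Decodes one `Dir` from a toy layout (an assumption, not bincode's format):
/// [u32 path length, LE][path bytes, UTF-8][f64 rank, LE].
/// Returns `None` on truncated or invalid input.
fn decode_dir<'a>(buf: &'a [u8]) -> Option<Dir<'a>> {
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(buf.get(..4)?);
    let len = u32::from_le_bytes(len_bytes) as usize;

    let path_end = 4usize.checked_add(len)?;
    let rank_end = path_end.checked_add(8)?;

    // Validates UTF-8 without copying: `path` points into `buf`.
    let path = std::str::from_utf8(buf.get(4..path_end)?).ok()?;

    let mut rank_bytes = [0u8; 8];
    rank_bytes.copy_from_slice(buf.get(path_end..rank_end)?);
    let rank = f64::from_le_bytes(rank_bytes);

    Some(Dir { path: Cow::Borrowed(path), rank })
}

/// Encoder for the same toy layout, used only to build input for `decode_dir`.
fn encode_dir(path: &str, rank: f64) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + path.len() + 8);
    out.extend_from_slice(&(path.len() as u32).to_le_bytes());
    out.extend_from_slice(path.as_bytes());
    out.extend_from_slice(&rank.to_le_bytes());
    out
}
```

Decoding the output of `encode_dir` gives a `Dir` whose `path` matches `Cow::Borrowed(_)` and whose string pointer lies inside the input buffer, so no new `String` is allocated.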
Since the necessary bytes are already in the `Vec<u8>` that you're trying to deserialize, bincode will just give you a pointer to it instead of copying a whole new `String` into memory, which is much faster. You can verify this by checking that the deserialized struct actually contains a `Cow::Borrowed`. You might be able to use a regular `&str` too, although I'm not sure how bincode would handle different endianness in that case.
P.S. Thanks for your fantastic work on rkyv and these benchmarks! Not only is rkyv's performance amazing, it's ideas like structver that would make this a serious contender for anyone wanting a serialization library. I'm currently using a poor man's structver myself, so the idea of rkyv handling this for me sounds absolutely great.