I have a Merkle tree data structure that consists of lots of hashes. Hashes are fixed-size byte arrays. `postcard` serializes them (through the byte array → tuple of bytes) by simply writing them to the target. This could be very fast in theory.

In practice, as posted in https://github.com/est31/serde-big-array/issues/19, we see the following:

- `own` is basically just passing the byte array directly into the output, with no length prefix.
- `bytes` uses serde's `serialize_bytes`, which includes a length prefix and then dumps the rest of the bytes directly.
- `big_array` uses `serde-big-array`, and is irrelevant for this issue.
- `fixed_size` uses serde's `impl Serialize for [u8; 32]`, no length prefix.
- `variable_size` uses serde's `impl Serialize for [u8]`, which includes a length prefix.

As you can see, by using `serde`, I'd be leaving a lot of performance on the table. Dumping the input array directly into the target costs 5 ns, and the best I can do with serde is 12 ns if I accept the wasted extra byte in storage, or 15 ns if I do not. This leads to a measured >2x real-world performance degradation of the serialization of the tree structure that I have.
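
To make the difference between the strategies concrete, here is a minimal std-only sketch of the write patterns described above. It is a simplified model, not the code of `postcard` or `serde`: every function name is hypothetical, a `Vec<u8>` stands in for the output target, and the length prefix is assumed to be a varint (one byte for a 32-byte hash). It only illustrates the byte layout and the call pattern (one bulk copy versus one write per element). It says nothing about timings, since an optimizer may well turn the loops in this toy version into bulk copies.

```rust
// Simplified, std-only model of the benchmarked strategies.
// All names are hypothetical; this is not postcard's or serde's code.

const HASH_LEN: usize = 32;

/// Appends `value` as a varint: 7 payload bits per byte,
/// high bit set on every byte except the last.
fn write_varint(out: &mut Vec<u8>, mut value: usize) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Models `own`: one bulk copy, no length prefix.
fn write_own(out: &mut Vec<u8>, hash: &[u8; HASH_LEN]) {
    out.extend_from_slice(hash);
}

/// Models `bytes`: length prefix, then one bulk copy.
fn write_bytes(out: &mut Vec<u8>, hash: &[u8]) {
    write_varint(out, hash.len());
    out.extend_from_slice(hash);
}

/// Models `fixed_size`: no length prefix, but each byte is
/// written as a separate element, like the fields of a tuple.
fn write_fixed_size(out: &mut Vec<u8>, hash: &[u8; HASH_LEN]) {
    for &b in hash.iter() {
        out.push(b);
    }
}

/// Models `variable_size`: length prefix, then each byte is
/// written as a separate element, like the items of a sequence.
fn write_variable_size(out: &mut Vec<u8>, hash: &[u8]) {
    write_varint(out, hash.len());
    for &b in hash {
        out.push(b);
    }
}
```

Under these assumptions, `write_own` and `write_fixed_size` both emit exactly 32 bytes for a 32-byte hash, while `write_bytes` and `write_variable_size` emit 33 (the prefix byte `0x20` followed by the hash). The outputs of each pair are byte-identical; only the number of write calls differs.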