djkoloski / rust_serialization_benchmark

Benchmarks for rust serialization frameworks

Benchmark including I/O #22

Open elbaro opened 2 years ago

elbaro commented 2 years ago

Adding a benchmark including I/O time would be useful. Usually serialization/deserialization involves network or disk I/O, and the medium affects the overall performance.

For example, it is unclear from the table in README.md whether rkyv/speedy are faster than prost in real scenarios, because the disk read is faster with protobuf. rkyv/speedy are probably faster than protobuf over a LAN. Are they faster on an SSD? We do not know.

These additions would help readers quickly choose a library.

djkoloski commented 2 years ago

I think this would be nice to have, but I'm unsure how feasible/useful it would be to provide these numbers. The IO time should depend solely on the IO speed of the device for reads and writes, since an apples-to-apples comparison would:

Benchmarking an mmap to acquire the bytes could be useful for some users, but I would definitely be concerned about the portability of those results across hardware. Even HDDs can vary a lot in performance across manufacturers, especially with regard to random-access reads.

Additionally, there's currently no dedicated hardware for benchmarking and I do not have some of the suggested configurations. Perhaps some of these (like Network) could be emulated?

If you have any suggestions on how to approach these problems, I'd be glad to hear them!

elbaro commented 2 years ago

I meant to suggest providing very approximate numbers like this.

For example, consider the log case:

| Format / Lib | Serialize | Deserialize | Size | Zlib |
|---|---|---|---|---|
| rkyv | 306.63 us | 3.2056 ms* 3.9919 ms* | 1011488 | 269353 |

Take the very rough figure "Read 1,000,000 bytes sequentially from disk: 825,000 ns". Ignoring disk-seek time and other factors, reading 1011488 bytes from an HDD takes 1011488 / 1000000 * 825 us = 834 us, which already dominates the unverified zero-copy numbers (834 us >> 16.632 us). This tells us that choosing between abomonation, flatbuffers, rkyv, and alkahest for zero-copy deserialization makes little difference on an HDD.

Zero-copy deserialization speed

| Format / Lib | Access | Read | Update |
|---|---|---|---|
| abomonation | 36.589 us* | 57.773 us* | |
| capnp | 146.66 ns* | 496.42 us* | |
| flatbuffers | 2.9546 ns* 2.0092 ms* | 137.99 us* 2.1892 ms* | |
| rkyv | 1.3871 ns* 756.30 us* | 16.632 us* 776.72 us* | 66.600 us |
| alkahest | 2.0442 ns* | 81.230 us* | |
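
The back-of-the-envelope estimate in this comment can be reproduced with a few lines of Rust. This is only a sketch of the arithmetic: the function name `estimated_read_us` is made up for illustration, the 825 us per 1,000,000 bytes reference figure is the rough number quoted above, and linear scaling with size (no seek time, no caching) is an assumption.

```rust
/// Hypothetical helper: scales a reference sequential-read latency
/// linearly with the payload size. Ignores seek time, caching, and
/// every other real-world factor.
fn estimated_read_us(bytes: u64, ref_bytes: u64, ref_us: f64) -> f64 {
    bytes as f64 / ref_bytes as f64 * ref_us
}

fn main() {
    // Size of the rkyv-serialized log data set, taken from the table above.
    let size_bytes: u64 = 1_011_488;
    // Rough figure quoted above: 1,000,000 bytes read sequentially in 825 us.
    let read_us = estimated_read_us(size_bytes, 1_000_000, 825.0);
    // rkyv "Read" number from the zero-copy table above.
    let rkyv_read_us = 16.632;

    println!("estimated disk read: {:.0} us", read_us);
    println!("disk read / rkyv read: {:.0}x", read_us / rkyv_read_us);
}
```

Under these assumptions the disk read comes out at roughly 834 us, about 50 times the 16.632 us rkyv read time, which is the gap the comment points at.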

While I agree that many details (HDD rpm, manufacturer, mmap, buffered or not, usage pattern, ...) affect the result, I still find it useful to be able to eyeball the order of magnitude of the result. There are several options:

  1. Provide a script with IO that people can run on their own hardware.
  2. Pick a specific hardware and usage pattern, and clarify 'Seagate 5400rpm mmap sequential read 10MB ... with warmup'.

...or add a warning that HDD latency dominates some numbers while network/SSD latencies are negligible. I just realized that network/SSD may be fast enough to be ignored.
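
For the first option, a script that people run on their own hardware, a starting point might look like the sketch below. It uses only the Rust standard library; the helper name `time_read`, the file name `io_bench_payload.bin`, and the dummy payload are all hypothetical and not part of the benchmark repository. One important caveat: reading a file right after writing it will usually be served from the OS page cache, so this measures a warm read rather than the raw device. A real script would need to drop caches or use a file larger than memory.

```rust
use std::fs::{self, File};
use std::io::{self, Read, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical helper: reads the whole file into memory and returns
/// the number of bytes read together with the elapsed wall-clock time.
fn time_read(path: &Path) -> io::Result<(usize, Duration)> {
    let start = Instant::now();
    let mut buf = Vec::new();
    File::open(path)?.read_to_end(&mut buf)?;
    Ok((buf.len(), start.elapsed()))
}

fn main() -> io::Result<()> {
    // Assumed location and name for the scratch file.
    let path = std::env::temp_dir().join("io_bench_payload.bin");

    // Dummy payload with the same size as the rkyv log output quoted above.
    // A real benchmark would write the actual serialized bytes instead.
    let payload = vec![0xABu8; 1_011_488];
    let mut file = File::create(&path)?;
    file.write_all(&payload)?;
    file.sync_all()?; // flush to the device before measuring
    drop(file);

    // NOTE: this read is likely served from the page cache (see lead-in).
    let (n, elapsed) = time_read(&path)?;
    println!("read {} bytes in {:?}", n, elapsed);

    fs::remove_file(&path)?;
    Ok(())
}
```

Adding the measured read time to each library's deserialize time would give the kind of order-of-magnitude comparison discussed above.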

djkoloski commented 2 years ago

Those are good suggestions. I think that, of the options available, the first would probably be the gold standard and the second would be good for people passing by. If you really want to know how each library will perform on your hardware, you'll need to run them in the proper environment. How about these concrete actions:

As part of this, I'll probably also need to nicen up the formatting tool and get it in version control as well.