only-cliches / NoProto

Flexible, Fast & Compact Serialization with RPC

Please include bincode in benchmarks and comparisons #6

Closed joshtriplett closed 3 years ago

joshtriplett commented 3 years ago

I'd love to see the bincode crate included in the benchmarks and comparisons.

only-cliches commented 3 years ago

Wow, awesome to see someone from the lang team on here!

Here are the places where NoProto excels compared to competing formats, based on my understanding of things.

  1. **Flexible At Runtime** If you need to work with data types that will change throughout the runtime of your application, you normally have to pick something like JSON, since highly optimized formats like Flatbuffers and Bincode depend on compiling data types into your application. As far as I can tell, NoProto is the fastest format that doesn't require you to compile data types into your application (there's a quick bincode sketch below that shows the compile-time side of this tradeoff).

  2. **Extremely Fast Updates** If you have a workflow in your application that is read -> modify -> write with buffers, NoProto will usually outperform every other format, including Bincode and Flatbuffers. This is because NoProto never actually deserializes; it doesn't need to. I wrote this library with databases in mind: if you want to support client requests like "change the username field to X", NoProto will do this faster than any other format I've seen, usually orders of magnitude faster. That includes complicated mutations like "push a value onto the end of this nested list".

  3. **Incremental Deserializing** You only pay for the fields you read, nothing more. As mentioned above, there is no deserialization step in NoProto; opening a buffer typically performs no operations (except for sorted buffers, which are opt-in). Once you start asking for fields, the library navigates the buffer using the format rules to get just what you asked for and nothing else. If you have a workflow where you read a buffer and only grab a few fields inside it, NoProto will outperform most other libraries.

If your use case doesn't fit one of the three points above, then one of Bincode / Flatbuffers / Cap'n Proto is probably a better choice. Hope that makes sense!
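To make point 1 concrete, here is a minimal sketch of the bincode side of that tradeoff (assuming bincode 1.x with serde derive; the `User` struct and its values are just an illustration, not the benchmark's actual types): the schema lives in compiled Rust code, and reading or updating a single field means decoding and re-encoding the whole value.

```rust
// Assumed dependencies: serde = { version = "1", features = ["derive"] }, bincode = "1"
use serde::{Deserialize, Serialize};

// The schema is a compile-time Rust type; changing it requires recompiling the application.
#[derive(Serialize, Deserialize, Debug)]
struct User {
    name: String,
    age: u16,
    tags: Vec<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let user = User {
        name: "billy".into(),
        age: 47,
        tags: vec!["admin".into()],
    };

    // Encode the whole struct in one shot.
    let bytes: Vec<u8> = bincode::serialize(&user)?;

    // Decoding is also all-or-nothing: even if you only care about `age`,
    // the entire struct is rebuilt first, so "update one field" becomes
    // decode -> mutate -> re-encode.
    let mut decoded: User = bincode::deserialize(&bytes)?;
    decoded.age += 1;
    let updated: Vec<u8> = bincode::serialize(&decoded)?;

    assert_ne!(bytes, updated);
    Ok(())
}
```

Points 2 and 3 above are exactly the cases where NoProto skips that decode/re-encode round trip by reading and mutating fields in place inside the buffer.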

This benchmark was run with v0.7.1 of NoProto:

| Library | Encode | Decode All | Decode 1 | Update 1 | Size (bytes) | Size (Zlib) |
|---|---|---|---|---|---|---|
| NoProto | 1,209 | 1,653 | 50,000 | 14,085 | 209 | 167 |
| Flatbuffers | 1,189 | 15,625 | 250,000 | 1,200 | 264 | 181 |
| Bincode | 6,250 | 9,434 | 10,309 | 4,367 | 163 | 129 |
| Protocol Buffers 2 | 958 | 1,263 | 1,285 | 556 | 154 | 141 |
| MessagePack | 154 | 242 | 271 | 136 | 296 | 187 |
| JSON | 606 | 471 | 605 | 445 | 439 | 184 |
| BSON | 127 | 122 | 132 | 96 | 414 | 216 |

As always, the benchmark source code is in the `bench` folder of the repo if you want to inspect the code or run the benchmarks yourself.

joshtriplett commented 3 years ago

@only-cliches That's extremely helpful, thank you! That makes the tradeoffs between size, performance, and type of operation (all data vs single-item decode/update) really clear, and those are some impressive benchmark results.

Would you consider adding bincode to the two comparison tables in the README?

only-cliches commented 3 years ago

Done! I also added further points to the ones above in the Readme: https://github.com/only-cliches/NoProto#noproto-strengths

It hadn't occurred to me before, but the failure mode for most of these high-performance formats is pretty crazy.

joshtriplett commented 3 years ago

Much appreciated, thank you!