djkoloski / rust_serialization_benchmark

Benchmarks for rust serialization frameworks

Framework hinting standard #43

Open djkoloski opened 11 months ago

djkoloski commented 11 months ago

Some crates are leaving performance on the table because they're not taking advantage of data invariants. For example:

Coming up with a standard for what is and is not admissible for crates to do is important so that users get a realistic view of how each framework would perform for the majority of use cases. I think my current position is that we should:

  1. Document exactly which invariants crates are allowed to rely on. This encompasses data ranges and valid values; anything outside of the documented invariants is off-limits (a sketch of what this could look like follows this list).
  2. Allow some minor optimizations. Anything that only requires a little work is OK. My unhelpful metric would be "as much help as you can get from five minutes in a text chat".
  3. Make the benchmarks more comprehensive. If crates are able to turn a benchmark into a memcpy war, then their punishment will be more difficult schemas. This should be the preferred route for foiling metagaming.
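
As a rough illustration of point 1, the documented invariants could live as doc comments directly on the benchmark types, so frameworks know exactly what they may and may not rely on. The struct and field names below are hypothetical, not the benchmark's actual schema; the ranges echo the Log dataset constraints mentioned later in this thread.

```rust
/// A hypothetical benchmark record with its invariants documented inline.
/// Anything not stated here would be off-limits for frameworks to exploit.
pub struct LogRecord {
    /// HTTP response code; only 63 distinct values occur in the dataset.
    pub code: u16,
    /// Response size in bytes; always less than 100_000_000.
    pub size: u64,
}
```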

@finnbear wrote up an insightful opinion on the matter:

I totally agree the benchmarks should not be gamed in any way and that, to the extent possible, they should represent the performance real users can expect to see!

That said, I think the term "default performance" is entirely subjective because performance depends on:

* How the data are distributed:

  * Are all values equally likely, or are only a few very small values possible?
  * Do the assumptions made by the format match the distribution of the data?

* Whether optimizations are in place:

  * Using enums instead of strings when appropriate
  * Not using unnecessarily large integer types
  * Using available hints when the data distribution is known in advance

As for "default," some formats require:

* creating protocol files

* installing specialized protocol compilers

* manually implementing functions

* running compression algorithms to get competitive file sizes

I think some optimization is a fair ask for users who care about performance. In other words, the recommended way to integrate bitcode is not just to import the crate and derive the traits, but to

* spend a few minutes replacing strings with enums where appropriate

* apply domain knowledge of your dataset by using minimal-width integers and/or range hints

Without the hints, do we get an incorrect general view of bitcode's performance?

There are a few areas in which I think that the benchmarks unrealistically hold back non-hinted bitcode:

* Log

  * Using strings instead of numbers/enums for month and timezone (collectively the date) and http method/protocol
  * Using a `u16` when there are only 63 possible response codes
  * Storing a number that maxes out at 100,000,000 in a `u64` helps encodings that do var-int by default

* Mesh

  * Randomly generated floats are always between 0 and 1, which lets compression schemes on bytewise formats take advantage, while the compression itself isn't timed in the speed benchmarks.

* Minecraft Save Data

  * Using strings instead of enums for item/entity ids, recipes, dimensions

The currently-existing bitcode hints can only recover some of the performance lost due to the above.
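
For concreteness, here is a rough sketch of the kind of Log-record changes described above. The field and type names are illustrative rather than the benchmark's exact schema; the point is enums for closed sets of strings and integer widths matched to the data.

```rust
use serde::{Deserialize, Serialize};

// Closed set of HTTP methods, previously stored as a String such as "GET".
#[derive(Serialize, Deserialize)]
enum HttpMethod { Get, Post, Put, Delete, Head, Options, Patch }

#[derive(Serialize, Deserialize)]
struct LogEntry {
    method: HttpMethod,
    /// Index into the 63 response codes that actually occur (was a `u16`).
    code: u8,
    /// Never exceeds 100_000_000, so a `u32` is sufficient (was a `u64`).
    size: u32,
}
```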

TL;DR: To get the most realistic view of bitcode's performance (in order of importance):

1. apply the aforementioned optimizations

2. include total time (serialization _and_ compression) next to compressed sizes for all formats

3. do come up with an explicit policy on hinting/optimizing

4. consistently apply that policy, including with respect to `bitcode`

finnbear commented 11 months ago

I'm really happy about this plan to improve the benchmarks! :star2: :clap:

> Document exactly which invariants crates are allowed to rely on. This encompasses data ranges and valid values; anything outside of the documented invariants is off-limits.

Strongly agree, but do be sure to define the phrase "rely on." I see a distinction between crashing/erroring if the invariant is broken and merely pessimizing the performance/size. For reference, bitcode hints are of the latter category (if you're wrong, you haven't broken anything except your bandwidth costs).
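
To illustrate that second category, here is a minimal sketch (not bitcode's actual API) of a "soft" range hint: a value inside the hinted range encodes compactly, and a value outside it falls back to a full-width escape encoding, so a wrong hint costs bytes rather than correctness.

```rust
/// Encode `value` using a hinted maximum. Values below a one-byte hint take
/// the 1-byte fast path; anything else is written with an escape marker plus
/// the full 4 bytes, so a wrong hint never breaks round-tripping.
fn encode_with_hint(value: u32, hinted_max: u32, out: &mut Vec<u8>) {
    if hinted_max <= 0xFF && value < hinted_max {
        out.push(value as u8); // value <= 254, so it can't collide with 0xFF
    } else {
        out.push(0xFF); // escape marker: the hint didn't apply
        out.extend_from_slice(&value.to_le_bytes());
    }
}

/// Decode a value written by `encode_with_hint`, returning the value and the
/// number of bytes consumed.
fn decode_with_hint(buf: &[u8]) -> (u32, usize) {
    if buf[0] != 0xFF {
        (buf[0] as u32, 1)
    } else {
        let bytes: [u8; 4] = buf[1..5].try_into().unwrap();
        (u32::from_le_bytes(bytes), 5)
    }
}
```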

> Make the benchmarks more comprehensive. If crates are able to turn a benchmark into a memcpy war, then their punishment will be more difficult schemas. This should be the preferred route for foiling metagaming.

Yes please, but do note there is also a compression war going on :wink: :crossed_swords:

Compression algorithms will happily rely on any invariant they can detect, so please either remove those invariants or make them more realistic:

Finally, don't forget to time the compression!
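
A minimal sketch of what timing serialization and compression together could look like, assuming `serde`, `bincode` 1.x, and `lz4_flex` purely as example dependencies; the real harness would use each framework's own API and the existing criterion benchmarks.

```rust
use std::time::{Duration, Instant};

/// Serialize `value` and compress the result, returning the compressed bytes
/// and the combined wall-clock time, so the size win and its time cost are
/// reported together.
fn serialize_and_compress<T: serde::Serialize>(value: &T) -> (Vec<u8>, Duration) {
    let start = Instant::now();
    let encoded = bincode::serialize(value).expect("serialization failed");
    let compressed = lz4_flex::compress_prepend_size(&encoded);
    (compressed, start.elapsed())
}
```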

> Allow some minor optimizations. Anything that only requires a little work is OK. My unhelpful metric would be "as much help as you can get from five minutes in a text chat".

That is a helpful metric! :heart:

For complex schemas, I'd say five minutes per type would be more scalable :ok_hand: