fabienrenaud / java-json-benchmark

Performance testing of serialization and deserialization of Java JSON libraries
MIT License
955 stars, 132 forks

Clarify benchmarks #96

Open Tillerino opened 4 months ago

Tillerino commented 4 months ago

Hi there!

Thank you for creating and maintaining this benchmark. I am writing my own databind library and these benchmarks are just so great to have. I can plop in my own library within a couple of minutes and have great comparisons. (I'll get back to you when/if I have something presentable to add here :slightly_smiling_face:)

I am a little worried about the following. You write in the README:

When available, both databinding and 'stream' (custom packing and unpacking) implementations are tested.

but nothing further about how these alternatives are then weighed against each other. I'll just pick Jackson as an example here (not because I have anything against it, just because I know it best).

When I think Jackson, I think jackson-databind a.k.a. everyone's best friend: the ObjectMapper. However, from what I gather from the raw data, the results for Jackson shown in the graphs appear to be from the "stream" test, i.e. custom serialization based on jackson-core. This explains why "jackson" comes even close to "jackson_afterburner" and "jackson_blackbird". In reality, the vanilla ObjectMapper is quite a bit slower than afterburner and blackbird, but looking at the graph, I get the impression that there is no reason to even choose afterburner or blackbird, because there is no performance benefit.

So don't get me wrong: This is not about correctness, but about the impression that I get at a glance, which is probably what most people take away from benchmarks like this.

So I guess I am asking if you can clarify this difference a bit in the way that the results are presented. Again, picking Jackson as an example: I would expect any result that just says "jackson" to be from the vanilla ObjectMapper, but seeing the raw speed of the jackson-core parser is also interesting. Maybe both could be shown in the graph.

Anyways, nice project, cheers!

fabienrenaud commented 4 months ago

Good catch. These used to be reported in 2017 and before (https://github.com/fabienrenaud/java-json-benchmark/blob/master/archive/raw-results-2017-05-21.md), but something must have broken in the toolchain, so the stream/databind suffixes are no longer included.

Philzen commented 2 months ago

Actually, comparing https://github.com/fabienrenaud/java-json-benchmark/blob/master/archive/raw-results-2024-01-30.md and https://docs.google.com/spreadsheets/d/1a4kgv2R-IxANE_itV-qJwCnEBvc0HqHGh4bp4AXTFoY/edit?pli=1#gid=295954490, the raw results contain numbers from both /stream and /databind for some libraries (e.g. GSON). However, only one of the numbers made it into the spreadsheet, and it is completely unclear which one was picked.

fabienrenaud commented 2 months ago

Yeah, the conversion script is probably broken somewhere in the way it’s parsing the raw data. Definitely something that needs fixing before the next run.

Fabien

fabienrenaud commented 1 month ago

It's probably this line parsing the lib name wrong: https://github.com/fabienrenaud/java-json-benchmark/blob/master/output/toCsv.py#L31

You can see the expected regex in the toMd script: https://github.com/fabienrenaud/java-json-benchmark/blob/master/output/toMd.sh

It should be something.(databind|stream).(Deserialization|Serialization).<libname>, but clearly the Python script only captures libname...
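For illustration, here is a minimal Python sketch of a parser that keeps the databind/stream part instead of dropping it. It follows the shape described above, but it is not the actual code from toCsv.py or toMd.sh: the function names, the `lib/api` label format, and the example benchmark names (`pkg.databind.Deserialization.jackson` and so on) are all assumptions made up for this example, and the real benchmark names in the raw results may differ.

```python
import re

# Hypothetical pattern, following the shape described in this thread:
#   <anything>.(databind|stream).(Deserialization|Serialization).<libname>
# The actual benchmark name format in the raw results is an assumption here.
BENCHMARK_RE = re.compile(
    r"\.(?P<api>databind|stream)"
    r"\.(?P<op>Deserialization|Serialization)"
    r"\.(?P<lib>\w+)$"
)


def parse_benchmark_name(name):
    """Return (lib, api, op) for a benchmark name, or None if it does not match."""
    match = BENCHMARK_RE.search(name)
    if match is None:
        return None
    return match.group("lib"), match.group("api"), match.group("op")


def label(name):
    """Build a label that keeps the databind/stream suffix, e.g. 'jackson/databind'."""
    parsed = parse_benchmark_name(name)
    if parsed is None:
        # Leave unrecognized names untouched rather than guessing.
        return name
    lib, api, _op = parsed
    return f"{lib}/{api}"


if __name__ == "__main__":
    # Made-up example names, only to show the two variants staying distinct.
    for example in (
        "pkg.databind.Deserialization.jackson",
        "pkg.stream.Deserialization.jackson",
    ):
        print(label(example))
```

With a label like this, the databind and stream results for the same library end up in separate rows or columns instead of one silently overwriting the other.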