miloyip / nativejson-benchmark

C/C++ JSON parser/generator benchmark
MIT License

Simdjson v0.3 Support #131

Closed Mark407 closed 4 years ago

Mark407 commented 4 years ago

Added support for simdjson (#113) (https://github.com/simdjson/simdjson). As far as I am aware, it does not currently support the SAX or prettify tests, so these have been excluded.

(The current build also appears to be broken. I used the workarounds in https://github.com/mloskot/nativejson-benchmark/tree/ml/issue-102-add-workarounds-to-build; with those applied, this PR compiles.)

Results (on an i7-8550U), with RapidJSON for comparison:

Benchmarking Performance of simdjson

Parse canada.json ... 3.974 ms 540.204 MB/s
Parse citm_catalog.json ... 1.039 ms 1585.361 MB/s
Parse twitter.json ... 0.441 ms 1365.666 MB/s
Stringify canada.json ... 69.225 ms 31.011 MB/s
Stringify citm_catalog.json ... 5.887 ms 279.801 MB/s
Stringify twitter.json ... 7.138 ms 84.374 MB/s
Statistics canada.json ... 1.221 ms 1758.206 MB/s
Statistics citm_catalog.json ... 0.436 ms 3777.959 MB/s
Statistics twitter.json ... 0.155 ms 3885.540 MB/s

Benchmarking Performance of RapidJSON (C++)

Parse canada.json ... 5.153 ms 416.606 MB/s
Parse citm_catalog.json ... 2.563 ms 642.680 MB/s
Parse twitter.json ... 1.786 ms 337.211 MB/s
Stringify canada.json ... 9.878 ms 217.328 MB/s
Stringify citm_catalog.json ... 1.270 ms 1297.000 MB/s
Stringify twitter.json ... 0.979 ms 615.177 MB/s
Statistics canada.json ... 0.781 ms 2748.745 MB/s
Statistics citm_catalog.json ... 0.234 ms 7039.274 MB/s
Statistics twitter.json ... 0.092 ms 6546.290 MB/s

miloyip commented 4 years ago

Just after merging, I found that the numbers are quite strange. I haven't read the simdjson source code, but for parsing, it may not fully parse and convert the data. Currently, the tests for all libraries fully parse the JSON. If a library lazily evaluates some values, its test traverses the DOM and reads each value to force the evaluation. Can you help investigate this?

lemire commented 4 years ago

@miloyip We definitely fully parse the data in simdjson.