miloyip / nativejson-benchmark

C/C++ JSON parser/generator benchmark
MIT License

Data format normalization #53

Open coinex opened 8 years ago

coinex commented 8 years ago

It might be useful to "normalize" results by moving objects into maps and making copies of input strings, for libraries that destroy their inputs or use lists for maps, just to see how much of the penalty is a result of these shortcuts. Conformance testing helps a lot here... you can see some very fast libraries that sacrifice conformance. However, the internal data format and its overall "friendliness" is another area to compare.

For example:

But the parsers are pretty similar otherwise... and should perform about the same.
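A minimal sketch of the kind of normalization I mean (the types and names here are hypothetical, not taken from any of the benchmarked libraries):

```cpp
#include <map>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical internal representation of a "fast" parser: objects kept as
// ordered key/value lists, with string values pointing into the input buffer.
struct FastValue {
    std::vector<std::pair<std::string_view, std::string_view>> members;
};

// Normalized representation: a real map with owned string copies, so the
// input buffer can be released and key lookup is O(log n).
using NormalizedObject = std::map<std::string, std::string>;

// One possible normalization pass. The cost of this copy is exactly the
// penalty the fast parser avoided via its list layout and in-situ strings.
NormalizedObject normalize(const FastValue& v) {
    NormalizedObject out;
    for (const auto& [key, value] : v.members)
        out.emplace(std::string(key), std::string(value));
    return out;
}
```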

miloyip commented 8 years ago

It is quite difficult to "normalize" these differences. Some parsers get better performance from specially designed containers, and copying one data structure into another adds a lot of overhead.

It may be useful to add some benchmarks on DOM access speed, where map/unordered_map implementations should get better results.
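A rough sketch of what such a DOM access micro-benchmark could measure (illustrative only, not the benchmark's actual harness): repeated member lookup by name, which a list-backed object answers in O(n) per query while map/unordered_map answer in O(log n)/O(1).

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

int main() {
    std::map<std::string, int> object;  // stand-in for a parsed DOM object
    std::vector<std::string> keys;
    for (int i = 0; i < 1000; ++i) {
        keys.push_back("member" + std::to_string(i));
        object[keys.back()] = i;
    }

    auto begin = std::chrono::steady_clock::now();
    long long sum = 0;
    for (int round = 0; round < 1000; ++round)
        for (const auto& k : keys)
            sum += object.at(k);        // the operation being measured
    auto end = std::chrono::steady_clock::now();

    std::printf("sum=%lld, %.3f ms\n", sum,
                std::chrono::duration<double, std::milli>(end - begin).count());
}
```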

The differences in your example will also show up in some benchmarks. For example, destroying the input (in-situ parsing) will normally give a higher memory footprint, as the parts of the input buffer not occupied by string values cannot be released.
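To illustrate with hypothetical types (not any library's actual API): with in-situ parsing the decoded strings alias the input buffer, so the whole buffer must stay resident for the lifetime of the DOM.

```cpp
#include <memory>
#include <string_view>
#include <vector>

// With in-situ parsing, escape sequences are decoded in place and string
// values are views into the mutable input buffer. Even if strings are a tiny
// fraction of the document, the *entire* buffer must stay alive, so peak
// memory is roughly buffer size + DOM overhead, not just the strings.
struct InsituDom {
    std::unique_ptr<char[]> buffer;         // owns the whole JSON text
    std::vector<std::string_view> strings;  // point into `buffer`; no copies
};
```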

Actually, some "normalization" has been done already. For example, some parsers lazily skip number parsing. To be fair, we assume every value is parsed, so an additional conversion pass is added to those parsers' tests.
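For illustration, such an extra pass could look roughly like this (hypothetical types; the actual per-library test code differs):

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical output of a lazy parser: numbers kept as unparsed text slices.
struct LazyNumber {
    std::string text;    // e.g. "3.1416"
    double value = 0.0;  // filled in only on demand
};

// The kind of extra conversion pass the benchmark can add, so that lazy
// parsers pay the same number-parsing cost as eager ones.
void convertAll(std::vector<LazyNumber>& numbers) {
    for (auto& n : numbers)
        n.value = std::strtod(n.text.c_str(), nullptr);
}
```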