Closed miloyip closed 9 years ago
Hi! Thanks for the bug report. :) Indeed, it was valid. The latest sajson fixes the bug.
Thanks for RapidJSON, by the way. We use it at IMVU, and while I have a few things I don't particularly like about the API, you set a new high bar for everyone.
I have verified the fix in the benchmark. Thank you. If you have suggestions on RapidJSON, please feel free to drop an issue.
I just run with the both latest versions:
Benchmarking RapidJSON (C++)
Parse canada.json ... 9.763 ms 219.888 MB/s
Parse citm_catalog.json ... 5.638 ms 292.159 MB/s
Parse twitter.json ... 4.225 ms 142.546 MB/s
Benchmarking sajson (C++)
Parse canada.json ... 12.538 ms 171.221 MB/s
Parse citm_catalog.json ... 8.717 ms 188.963 MB/s
Parse twitter.json ... 2.661 ms 226.328 MB/s
It seems I need to figure out how to improve the performance for twitter.json
. :smile:
Hah! On my machine / benchmark, (an older version of) rapidjson outperformed sajson with twitter.json. :) I'd post numbers but I don't have access to that VM at the moment.
I've found they tend to go back and forth depending on the CPU and workload... which probably means there's room to improve both parsers. :)
One of my intention for making the benchmark is to learn from others and make improvements together.
I just learn about __builtin_expect
from your code. Is it really beneficial?
It depends on the CPU (branch predictor and whatever), compiler, and data set, but sometimes it can be very beneficial!
In particular, in this particular context in a different project ( https://github.com/chadaustin/buffer-builder/blob/master/cbits/buffer.cpp#L82 ) it was a substantial win.
I suspect it won because the compiler could then optimize the register usage / spillage for the hot path. You could clearly see in the generated assembly that clang was trying to make the hot path a single sequence of instructions, with only forward jumps for the cold paths. (I believe, by default, forward jumps are predicted not taken, and backward jumps are predicted taken.)
I recently try to add sajson to nativejson-benchmark in this branch.
It was able to parse two test json data, but failed for twitter.json. The error message generated is
with this code:
I am pretty sure the json is valid, as it is obtained from twitter's API, and also passed more than a dozens of parsers.
BTW, I am the author of RapidJSON :)