ateska closed 3 years ago
* The Cython-based PoC implementation (in-house, so far) delivers ~700k parser cycles per second (very close to the C++ implementation).
I am... skeptical. This binding is naive, so there's definitely room for improvement. That said, there was a Cython version, and the improvement was negligible.
The cost of creating a single Python object tends to be higher than the entire document parse. So if you're saying you're getting parity...
Happy to incorporate any improvements, but the general goal is to improve performance by avoiding working in Python land.
Here are some hard numbers:
----------------------------------------------------------------
# 'jsonexamples/test.json' 2397 bytes
----------------------------------------------------------------
* cysimdjson parse 539051.85 EPS ( 1.00) 1292.11 MB/s
* libpy_simdjson loads 375380.33 EPS ( 1.44) 899.79 MB/s
* pysimdjson parse 362136.78 EPS ( 1.49) 868.04 MB/s
* orjson loads 112062.53 EPS ( 4.81) 268.61 MB/s
* python json loads 72665.18 EPS ( 7.42) 174.18 MB/s
----------------------------------------------------------------
^ This illustrates the impact of the Python call overhead (cysimdjson is a Cython-based implementation).
The native (C++) performance is 542339.05 EPS.
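For anyone reading the tables: the three numeric columns appear to be parses per second (EPS), the slowdown factor relative to the fastest entry, and throughput. A minimal sketch of how they relate, assuming 1 MB = 10^6 bytes (which matches the figures above); the helper names are illustrative and not taken from the benchmark script:

```python
def throughput_mb_s(eps: float, size_bytes: int) -> float:
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return eps * size_bytes / 1e6


def slowdown(fastest_eps: float, eps: float) -> float:
    """How many times slower an entry is than the fastest one."""
    return fastest_eps / eps


# Figures taken from the test.json block above (2397 bytes).
print(round(throughput_mb_s(539051.85, 2397), 2))  # cysimdjson      -> 1292.11
print(round(throughput_mb_s(375380.33, 2397), 2))  # libpy_simdjson  -> 899.79
print(round(slowdown(539051.85, 375380.33), 2))    # libpy_simdjson  -> 1.44
```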
----------------------------------------------------------------
# 'jsonexamples/verysmall.json' 7 bytes
----------------------------------------------------------------
* cysimdjson parse 4414474.38 EPS ( 1.00) 30.90 MB/s
* orjson loads 3698816.51 EPS ( 1.19) 25.89 MB/s
* libpy_simdjson loads 1839016.53 EPS ( 2.40) 12.87 MB/s
* pysimdjson parse 1015434.93 EPS ( 4.35) 7.11 MB/s
* python json loads 526388.08 EPS ( 8.39) 3.68 MB/s
----------------------------------------------------------------
^ This one highlights the issue even more.
----------------------------------------------------------------
# 'jsonexamples/twitter.json' 631515 bytes
----------------------------------------------------------------
* cysimdjson parse 2651.49 EPS ( 1.00) 1674.46 MB/s
* libpy_simdjson loads 2445.90 EPS ( 1.08) 1544.63 MB/s
* pysimdjson parse 2423.09 EPS ( 1.09) 1530.22 MB/s
* orjson loads 386.69 EPS ( 6.86) 244.20 MB/s
* python json loads 294.36 EPS ( 9.01) 185.89 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'jsonexamples/canada.json' 2251051 bytes
----------------------------------------------------------------
* cysimdjson parse 289.98 EPS ( 1.00) 652.76 MB/s
* pysimdjson parse 284.94 EPS ( 1.02) 641.42 MB/s
* libpy_simdjson loads 278.46 EPS ( 1.04) 626.82 MB/s
* orjson loads 82.70 EPS ( 3.51) 186.17 MB/s
* python json loads 22.69 EPS ( 12.78) 51.09 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'jsonexamples/gsoc-2018.json' 3327831 bytes
----------------------------------------------------------------
* cysimdjson parse 836.00 EPS ( 1.00) 2782.05 MB/s
* pysimdjson parse 744.28 EPS ( 1.12) 2476.84 MB/s
* libpy_simdjson loads 666.20 EPS ( 1.25) 2217.00 MB/s
* orjson loads 166.08 EPS ( 5.03) 552.69 MB/s
* python json loads 113.87 EPS ( 7.34) 378.93 MB/s
----------------------------------------------------------------
The related work has been released here: https://github.com/TeskaLabs/cysimdjson
Re-running your cysimdjson tests, we're now often at parity or better. We can definitely do better, but I'm happy with this for now: in exchange for a small difference in speed, we are safer (preventing object reuse and memory issues) and more capable (e.g. buffer support).
----------------------------------------------------------------
# '/home/tktech/projects/cysimdjson/test/jsonexamples/test.json' 2397 bytes
----------------------------------------------------------------
* pysimdjson parse 1255476.29 EPS ( 1.00) 3009.38 MB/s
* cysimdjson parse 1235306.40 EPS ( 1.02) 2961.03 MB/s
* cysimdjson pad parse 1211152.53 EPS ( 1.04) 2903.13 MB/s
* orjson loads 207861.87 EPS ( 6.04) 498.24 MB/s
* python json loads 135765.75 EPS ( 9.25) 325.43 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# '/home/tktech/projects/cysimdjson/test/jsonexamples/twitter.json' 631515 bytes
----------------------------------------------------------------
* cysimdjson pad parse 5947.56 EPS ( 1.00) 3755.97 MB/s
* pysimdjson parse 5791.16 EPS ( 1.03) 3657.20 MB/s
* cysimdjson parse 5568.33 EPS ( 1.07) 3516.48 MB/s
* orjson loads 764.81 EPS ( 7.78) 482.99 MB/s
* python json loads 471.92 EPS ( 12.60) 298.02 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# '/home/tktech/projects/cysimdjson/test/jsonexamples/canada.json' 2251051 bytes
----------------------------------------------------------------
* cysimdjson pad parse 593.03 EPS ( 1.00) 1334.94 MB/s
* cysimdjson parse 554.87 EPS ( 1.07) 1249.04 MB/s
* pysimdjson parse 552.20 EPS ( 1.07) 1243.04 MB/s
* orjson loads 152.71 EPS ( 3.88) 343.75 MB/s
* python json loads 45.87 EPS ( 12.93) 103.26 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# '/home/tktech/projects/cysimdjson/test/jsonexamples/gsoc-2018.json' 3327831 bytes
----------------------------------------------------------------
* cysimdjson pad parse 1611.95 EPS ( 1.00) 5364.29 MB/s
* cysimdjson parse 1262.62 EPS ( 1.28) 4201.79 MB/s
* pysimdjson parse 1250.95 EPS ( 1.29) 4162.94 MB/s
* orjson loads 290.58 EPS ( 5.55) 967.01 MB/s
* python json loads 220.79 EPS ( 7.30) 734.76 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# '/home/tktech/projects/cysimdjson/test/jsonexamples/verysmall.json' 7 bytes
----------------------------------------------------------------
* cysimdjson parse 8896208.10 EPS ( 1.00) 62.27 MB/s
* pysimdjson parse 7945949.11 EPS ( 1.12) 55.62 MB/s
* orjson loads 7735180.97 EPS ( 1.15) 54.15 MB/s
* cysimdjson pad parse 6078851.00 EPS ( 1.46) 42.55 MB/s
* python json loads 1102638.34 EPS ( 8.07) 7.72 MB/s
----------------------------------------------------------------
I haven't studied your implementation in detail, but it would be super useful (for us :-) ) if there were an official way to retrieve the SIMDJSON C++ object (a reference or pointer to it) from a Python wrapper when it is passed to other Cython code outside of this library. We frequently use Cython for acceleration, and passing values from C++ in SIMDJSON through pysimdjson and Python back to Cython is an unnecessary yet significant performance hit. May I kindly ask whether that has been part of the design in any form or shape?
It would be wonderful to see it in the pysimdjson v4 b/c after that we can "merge/close" cysimdjson implementation ;-)
@ateska do you have any small usage examples? Helps when adding a feature to see how it'll be used. This should probably be a new issue.
We can definitely do this easily with pycapsules and buffers.
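A rough illustration of the capsule idea, using only `ctypes` from the standard library so it runs on CPython without any extension module. Here a `c_int64` stands in for the C++ object, and the capsule name `example.element` is made up; how pysimdjson would actually expose its objects is not settled in this thread, so treat this as a sketch of the mechanism only.

```python
import ctypes

# Bind the CPython capsule C-API functions (these exist in the CPython C API).
PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# Hypothetical capsule name; it must stay alive as long as the capsule does.
CAPSULE_NAME = b"example.element"

# Stand-in for a native object owned by the wrapper library.
native_value = ctypes.c_int64(42)

# Producer side: wrap the raw pointer in an opaque Python object (no destructor).
capsule = PyCapsule_New(ctypes.addressof(native_value), CAPSULE_NAME, None)

# Consumer side (e.g. other Cython code): recover the pointer without copying.
ptr = PyCapsule_GetPointer(capsule, CAPSULE_NAME)
recovered = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_int64)).contents.value
print(recovered)
```

The consumer must know the capsule name and the pointed-to type, and the producer must keep the underlying object alive while the capsule is in use.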
I'll try to provide some ... in the new issue ;-) Thanks.
FYI: I managed to run the benchmark again, on the new branch of cysimdjson and I got this:
% PYTHONPATH=. python3 ./perftest/test_benchmark.py
----------------------------------------------------------------
# 'perftest/jsonexamples/test.json' 2397 bytes
----------------------------------------------------------------
* cysimdjson parse 638926.23 EPS ( 1.00) 1531.51 MB/s
* cysimdjson pad parse 606547.25 EPS ( 1.05) 1453.89 MB/s
* pysimdjson parse 606379.25 EPS ( 1.05) 1453.49 MB/s
* python json loads 41720.92 EPS ( 15.31) 100.01 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'perftest/jsonexamples/twitter.json' 631515 bytes
----------------------------------------------------------------
* cysimdjson pad parse 3304.32 EPS ( 1.00) 2086.73 MB/s
* cysimdjson parse 2985.17 EPS ( 1.11) 1885.18 MB/s
* pysimdjson parse 2906.61 EPS ( 1.14) 1835.57 MB/s
* python json loads 204.97 EPS ( 16.12) 129.44 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'perftest/jsonexamples/canada.json' 2251051 bytes
----------------------------------------------------------------
* cysimdjson pad parse 289.50 EPS ( 1.00) 651.68 MB/s
* cysimdjson parse 281.92 EPS ( 1.03) 634.63 MB/s
* pysimdjson parse 262.74 EPS ( 1.10) 591.45 MB/s
* python json loads 19.49 EPS ( 14.85) 43.87 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'perftest/jsonexamples/gsoc-2018.json' 3327831 bytes
----------------------------------------------------------------
* cysimdjson pad parse 781.42 EPS ( 1.00) 2600.42 MB/s
* cysimdjson parse 637.85 EPS ( 1.23) 2122.65 MB/s
* pysimdjson parse 536.78 EPS ( 1.46) 1786.31 MB/s
* python json loads 69.80 EPS ( 11.19) 232.30 MB/s
----------------------------------------------------------------
----------------------------------------------------------------
# 'perftest/jsonexamples/verysmall.json' 7 bytes
----------------------------------------------------------------
* cysimdjson parse 2605313.38 EPS ( 1.00) 18.24 MB/s
* cysimdjson pad parse 2571813.54 EPS ( 1.01) 18.00 MB/s
* pysimdjson parse 2312177.01 EPS ( 1.13) 16.19 MB/s
* python json loads 436467.51 EPS ( 5.97) 3.06 MB/s
----------------------------------------------------------------
I will dig a bit deeper and update this.
We are parsing a very high number of ~2KB JSON files in our Python-based application.
I also conducted a rather artificial test of how many "parser cycles" I can get with a basically empty JSON (`{}`). The issue is quite visible here: the overhead of the Python<->pysimdjson boundary crossing is high relative to other possible implementations. A "parser cycle" is defined as one call to `parser.parse(json)` on an existing parser instance.
I'm not 100% sure if this is a priority of this library, so feel free to close this one as irrelevant.
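For reference, a minimal sketch of how such a "parser cycles per second" figure could be measured. This is not the benchmark script used for the numbers in this thread; the function name is illustrative, and the standard-library `json.loads` is used as a stand-in so the snippet runs without pysimdjson installed.

```python
import json
import time


def parser_cycles_per_second(parse, payload, cycles=100_000):
    """Call parse(payload) `cycles` times and return calls per second."""
    start = time.perf_counter()
    for _ in range(cycles):
        parse(payload)  # result is discarded; only the call cost is measured
    elapsed = time.perf_counter() - start
    return cycles / elapsed


# Stand-in parser: stdlib json on a basically empty document.
eps = parser_cycles_per_second(json.loads, b"{}")
print(f"{eps:.2f} EPS")
```

With pysimdjson installed, passing `simdjson.Parser().parse` instead of `json.loads` should measure one `parser.parse(json)` call per cycle on a single existing parser instance.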