Open mingodad opened 2 years ago
Doing the test again with my fork of PeppaPEG compiled with P4_SIZE_T_UINT defined to use unsigned int
instead of size_t
the memory needed was reduced by 1/3 and time needed to 1/2 as before, also added https://github.com/waywardgeek/datadraw/tree/master/examples/JSON to the comparison (although it's custom parser but it uses datadraw
as main data structure).
#!/bin/sh
fname="rustc-1.49.0-src/src/tools/rls/rls-analysis/test_data/rls-analysis/librls_data.json"
/usr/bin/time lua decode-json-file.lua $fname
/usr/bin/time lua decode-json-file-lpeg.lua $fname
/usr/bin/time PeppaPEG-DAD/cli parse -G json.peg -e entry $fname > /dev/null
/usr/bin/time PeppaPEG-DAD/cli32 parse -G json.peg -e entry $fname > /dev/null
/usr/bin/time datadraw/examples/JSON/json < $fname > /dev/null
Output:
./test-big.sh
config table
defs table
refs table
compilation table
macro_refs table
impls table
version string
relations table
prelude table
imports table
1.29user 0.01system 0:01.31elapsed 100%CPU (0avgtext+0avgdata 59696maxresident)k
0inputs+0outputs (0major+16431minor)pagefaults 0swaps
config table
impls table
defs table
prelude table
refs table
macro_refs table
relations table
imports table
compilation table
version string
0.35user 0.02system 0:00.38elapsed 100%CPU (0avgtext+0avgdata 37096maxresident)k
0inputs+0outputs (0major+10727minor)pagefaults 0swaps
2.87user 0.03system 0:02.90elapsed 100%CPU (0avgtext+0avgdata 124040maxresident)k
0inputs+0outputs (0major+30744minor)pagefaults 0swaps
1.36user 0.04system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 91276maxresident)k
0inputs+0outputs (0major+22533minor)pagefaults 0swaps
0.18user 0.00system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 28956maxresident)k
0inputs+0outputs (0major+7000minor)pagefaults 0swaps
On all my previous measures I was using ./cli
compiled without optimizations and that was my fault, here is the results compiling with -O2
optimizations (the difference when defining P4_SIZE_T_UINT is only noticeably in memory usage (1/3 less when running on 64bits) ):
./test-big.sh
defs table
refs table
config table
prelude table
relations table
imports table
impls table
macro_refs table
version string
compilation table
1.20user 0.03system 0:01.23elapsed 99%CPU (0avgtext+0avgdata 59720maxresident)k
0inputs+0outputs (0major+16431minor)pagefaults 0swaps
impls table
defs table
config table
prelude table
imports table
version string
macro_refs table
relations table
compilation table
refs table
0.36user 0.02system 0:00.38elapsed 99%CPU (0avgtext+0avgdata 37100maxresident)k
0inputs+0outputs (0major+10727minor)pagefaults 0swaps
1.38user 0.02system 0:01.41elapsed 99%CPU (0avgtext+0avgdata 124268maxresident)k
0inputs+0outputs (0major+30746minor)pagefaults 0swaps
1.34user 0.04system 0:01.38elapsed 100%CPU (0avgtext+0avgdata 91276maxresident)k
0inputs+0outputs (0major+22531minor)pagefaults 0swaps
0.18user 0.01system 0:00.19elapsed 99%CPU (0avgtext+0avgdata 28992maxresident)k
0inputs+0outputs (0major+7000minor)pagefaults 0swaps
@mingodad I have not used dkjson before and would like to understand why lpeg/lua is out performant. Any idea?
A quick scan on lpeg code leads me to a possible cause: lpeg compiles grammar to bytecode and evaluate the input on a vm/stack based interpreter, while the current C implementation of Peppa PEG is a recursion based. The latter approach requires more context to save and more instructions to execute. Also, in lpeg, capture states are pre-allocated instead of malloc just-in-time like in Peppa PEG.
Is your feature request related to a problem? Please describe. Comparing performance of json parser with Lua show that PeppaPEG is 2x memory hungry and takes 2x more time.
Describe the solution you'd like At least the same performance or better.
Here is the Lua script using dkjson (http://dkolf.de/src/dkjson-lua.fsl/home):
And here is the script testing all with this file as imput (https://github.com/rust-lang/rls/blob/master/rls-analysis/test_data/rls-analysis/librls_data.json 7MB) :
And here is the output: