soasme / PeppaPEG

PEG Parser in ANSI C
https://soasme.com/PeppaPEG
MIT License
55 stars 7 forks source link

Performance compared to Lua #149

Open mingodad opened 2 years ago

mingodad commented 2 years ago

Is your feature request related to a problem? Please describe. Comparing performance of json parser with Lua show that PeppaPEG is 2x memory hungry and takes 2x more time.

Describe the solution you'd like At least the same performance or better.

Here is the Lua script using dkjson (http://dkolf.de/src/dkjson-lua.fsl/home):

local json = require ("dkjson")

local str = io.open(arg[1]):read("a*")

local obj, pos, err = json.decode (str, 1, nil)
if err then
  print ("Error:", err)
else
  for k,v in pairs(obj) do
    print (k, type(v))
  end
end

And here is the script testing all with this file as imput (https://github.com/rust-lang/rls/blob/master/rls-analysis/test_data/rls-analysis/librls_data.json 7MB) :

#!/bin/sh
fname="rustc-1.49.0-src/src/tools/rls/rls-analysis/test_data/rls-analysis/librls_data.json"
/usr/bin/time lua decode-json-file.lua $fname
/usr/bin/time lua decode-json-file-lpeg.lua $fname

/usr/bin/time PeppaPEG-DAD/cli parse -G json.peg -e entry $fname > /dev/null

And here is the output:

./test-big.sh
compilation table
refs    table
config  table
defs    table
imports table
version string
relations   table
impls   table
prelude table
macro_refs  table
1.23user 0.00system 0:01.23elapsed 100%CPU (0avgtext+0avgdata 59844maxresident)k
0inputs+0outputs (0major+16432minor)pagefaults 0swaps
config  table
imports table
defs    table
compilation table
refs    table
impls   table
macro_refs  table
version string
relations   table
prelude table
0.37user 0.01system 0:00.38elapsed 99%CPU (0avgtext+0avgdata 37248maxresident)k
0inputs+0outputs (0major+10731minor)pagefaults 0swaps
0   true
2.87user 0.02system 0:02.89elapsed 99%CPU (0avgtext+0avgdata 124372maxresident)k
0inputs+0outputs (0major+30745minor)pagefaults 0swaps
mingodad commented 2 years ago

Doing the test again with my fork of PeppaPEG compiled with P4_SIZE_T_UINT defined to use unsigned int instead of size_t the memory needed was reduced by 1/3 and time needed to 1/2 as before, also added https://github.com/waywardgeek/datadraw/tree/master/examples/JSON to the comparison (although it's custom parser but it uses datadraw as main data structure).

#!/bin/sh
fname="rustc-1.49.0-src/src/tools/rls/rls-analysis/test_data/rls-analysis/librls_data.json"
/usr/bin/time lua decode-json-file.lua $fname
/usr/bin/time lua decode-json-file-lpeg.lua $fname

/usr/bin/time PeppaPEG-DAD/cli parse -G json.peg -e entry $fname > /dev/null
/usr/bin/time PeppaPEG-DAD/cli32 parse -G json.peg -e entry $fname > /dev/null

/usr/bin/time datadraw/examples/JSON/json < $fname > /dev/null 

Output:

./test-big.sh
config  table
defs    table
refs    table
compilation table
macro_refs  table
impls   table
version string
relations   table
prelude table
imports table
1.29user 0.01system 0:01.31elapsed 100%CPU (0avgtext+0avgdata 59696maxresident)k
0inputs+0outputs (0major+16431minor)pagefaults 0swaps
config  table
impls   table
defs    table
prelude table
refs    table
macro_refs  table
relations   table
imports table
compilation table
version string
0.35user 0.02system 0:00.38elapsed 100%CPU (0avgtext+0avgdata 37096maxresident)k
0inputs+0outputs (0major+10727minor)pagefaults 0swaps
2.87user 0.03system 0:02.90elapsed 100%CPU (0avgtext+0avgdata 124040maxresident)k
0inputs+0outputs (0major+30744minor)pagefaults 0swaps
1.36user 0.04system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 91276maxresident)k
0inputs+0outputs (0major+22533minor)pagefaults 0swaps
0.18user 0.00system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 28956maxresident)k
0inputs+0outputs (0major+7000minor)pagefaults 0swaps
mingodad commented 2 years ago

On all my previous measures I was using ./cli compiled without optimizations and that was my fault, here is the results compiling with -O2 optimizations (the difference when defining P4_SIZE_T_UINT is only noticeably in memory usage (1/3 less when running on 64bits) ):

./test-big.sh
defs    table
refs    table
config  table
prelude table
relations   table
imports table
impls   table
macro_refs  table
version string
compilation table
1.20user 0.03system 0:01.23elapsed 99%CPU (0avgtext+0avgdata 59720maxresident)k
0inputs+0outputs (0major+16431minor)pagefaults 0swaps
impls   table
defs    table
config  table
prelude table
imports table
version string
macro_refs  table
relations   table
compilation table
refs    table
0.36user 0.02system 0:00.38elapsed 99%CPU (0avgtext+0avgdata 37100maxresident)k
0inputs+0outputs (0major+10727minor)pagefaults 0swaps
1.38user 0.02system 0:01.41elapsed 99%CPU (0avgtext+0avgdata 124268maxresident)k
0inputs+0outputs (0major+30746minor)pagefaults 0swaps
1.34user 0.04system 0:01.38elapsed 100%CPU (0avgtext+0avgdata 91276maxresident)k
0inputs+0outputs (0major+22531minor)pagefaults 0swaps
0.18user 0.01system 0:00.19elapsed 99%CPU (0avgtext+0avgdata 28992maxresident)k
0inputs+0outputs (0major+7000minor)pagefaults 0swaps
soasme commented 2 years ago

@mingodad I have not used dkjson before and would like to understand why lpeg/lua is out performant. Any idea?

A quick scan on lpeg code leads me to a possible cause: lpeg compiles grammar to bytecode and evaluate the input on a vm/stack based interpreter, while the current C implementation of Peppa PEG is a recursion based. The latter approach requires more context to save and more instructions to execute. Also, in lpeg, capture states are pre-allocated instead of malloc just-in-time like in Peppa PEG.