yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License
900 stars 112 forks source link

Taking too much time to load/parse grammar #197

Closed mingodad closed 2 years ago

mingodad commented 2 years ago

Making tests with cpp-peglib with grammars converted from pegjs I found one grammar where peglint and the online playground takes too much time/cpu to load/parse the grammar, the same grammar with leg from https://github.com/mingodad/peg parses instantly (also it's generated instantly).

See attached the grammar used for this test.

Comparison of both parsing the small sql input shown bellow:

/usr/bin/time peg-dad -o sql-parser.c sql-parser-naked.peg
rule '_TODO_' defined but not used
rule 'sym_bslash' defined but not used
rule 'sym_backtick' defined but not used
rule 'sym_dblquote' defined but not used
rule 'id_constraint_column' defined but not used
rule 'id_constraint_table' defined but not used
rule 'start_streaming' defined but not used
0.01user 0.00system 0:00.01elapsed 100%CPU (0avgtext+0avgdata 1944maxresident)k
0inputs+1064outputs (0major+133minor)pagefaults 0swaps

/usr/bin/time ./sql-parser test.sql
yy 1
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 1660maxresident)k
0inputs+0outputs (0major+65minor)pagefaults 0swaps

/usr/bin/time peglint sql-parser-naked.peg test.sql
sql-parser-naked.peg:1361:2: 'sym_dblquote' is not referenced.
sql-parser-naked.peg:5:2: 'start_streaming' is not referenced.
sql-parser-naked.peg:1274:2: 'id_constraint_column' is not referenced.
sql-parser-naked.peg:1364:2: 'sym_backtick' is not referenced.
sql-parser-naked.peg:1874:2: '_TODO_' is not referenced.
sql-parser-naked.peg:1271:2: 'id_constraint_table' is not referenced.
sql-parser-naked.peg:1406:2: 'sym_bslash' is not referenced.
12.43user 0.00system 0:12.43elapsed 100%CPU (0avgtext+0avgdata 5160maxresident)k
0inputs+0outputs (0major+411minor)pagefaults 0swaps

The sql input:

CREATE TABLE t(id INTEGER PRIMARY KEY);
SELECT t2.rowid FROM t JOIN t t2 JOIN t;
SELECT t2.id FROM t JOIN (t t2 JOIN t);
SELECT t2.rowid FROM t JOIN (t t2 JOIN t);

sql-parser.zip

mingodad commented 2 years ago

Here is the top of the profile output:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 11.35     12.39    12.39 897840609     0.00     0.00  __gnu_cxx::__enable_if<std::__is_char<char>::__value, bool>::__type std::operator==<char>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  7.75     20.85     8.46 99291609     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > std::__find_if<std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, __gnu_cxx::__ops::_Iter_pred<peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}> >(__gnu_cxx::__ops::_Iter_pred<peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, __gnu_cxx::__ops::_Iter_pred<peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, __gnu_cxx::__ops::_Iter_pred<peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, std::input_iterator_tag)
  6.65     28.11     7.26 897671287     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::operator++()
  6.46     35.17     7.06 1127231139     0.00     0.00  std::operator!=(std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
  6.38     42.14     6.97 214671719     0.00     0.00  bool __gnu_cxx::__ops::_Iter_pred<peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>::operator()<std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >(std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
  6.25     48.97     6.83 214671719     0.00     0.00  peg::DetectInfiniteLoop::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}::operator()(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&) const
  5.40     54.86     5.90 1242250419     0.00     0.00  std::_List_node<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_valptr()
  4.61     59.89     5.03 1242250419     0.00     0.00  __gnu_cxx::__aligned_membuf<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_ptr()
  3.86     64.11     4.22 1242250419     0.00     0.00  __gnu_cxx::__aligned_membuf<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_addr()
  3.79     68.25     4.14 1012957933     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::operator*() const
  3.36     71.92     3.67 683133252     0.00     0.00  bool __gnu_cxx::__ops::_Iter_pred<peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>::operator()<std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >(std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
  2.27     74.40     2.48   506719     0.00     0.00  std::_Head_base<1ul, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, false>::_Head_base(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
  1.40     75.93     1.53 15488317     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > std::__find_if<std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, __gnu_cxx::__ops::_Iter_pred<peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}> >(__gnu_cxx::__ops::_Iter_pred<peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, __gnu_cxx::__ops::_Iter_pred<peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, __gnu_cxx::__ops::_Iter_pred<peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}>, std::input_iterator_tag)
  1.38     77.44     1.51 115152962     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::operator--()
  1.36     78.92     1.49 459359057     0.00     0.00  std::__cxx11::list<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::end()
  1.29     80.33     1.41 175827226     0.00     0.00  __gnu_cxx::__exchange_and_add(int volatile*, int)
  1.05     81.48     1.15 175483855     0.00     0.00  __gnu_cxx::__atomic_add(int volatile*, int)
  1.00     82.57     1.10   506719     0.00     0.00  std::_Tuple_impl<0ul, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>::_Tuple_impl(char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
  0.99     83.65     1.08 99291609     0.00     0.00  peg::DetectInfiniteLoop::visit(peg::Reference&)
  0.84     84.57     0.92   506719     0.00     0.00  std::_Tuple_impl<0ul, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>::_M_head(std::_Tuple_impl<0ul, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>&)
  0.82     85.47     0.90 688785225     0.00     0.00  std::_List_iterator<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_List_iterator(std::__detail::_List_node_base*)
  0.82     86.37     0.90 683133252     0.00     0.00  peg::HasEmptyElement::visit(peg::Reference&)::{lambda(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)#1}::operator()(std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&) const
  0.74     87.17     0.81 49565954     0.00     0.00  peg::DetectInfiniteLoop::visit(peg::Sequence&)
  0.72     87.96     0.79 114646243     0.00     0.00  __gnu_cxx::new_allocator<std::_List_node<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::deallocate(std::_List_node<std::pair<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*, unsigned long)
  0.52     88.53     0.57 114674805     0.00     0.00  void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag)
mingodad commented 2 years ago

And here is the output of peglint with peglib.h commented the DetectLeftRecursion and DetectInfiniteLoop, it's instantaneous:

/usr/bin/time peglint sql-parser-naked.peg test.sql
sql-parser-naked.peg:1361:2: 'sym_dblquote' is not referenced.
sql-parser-naked.peg:5:2: 'start_streaming' is not referenced.
sql-parser-naked.peg:1274:2: 'id_constraint_column' is not referenced.
sql-parser-naked.peg:1364:2: 'sym_backtick' is not referenced.
sql-parser-naked.peg:1874:2: '_TODO_' is not referenced.
sql-parser-naked.peg:1271:2: 'id_constraint_table' is not referenced.
sql-parser-naked.peg:1406:2: 'sym_bslash' is not referenced.
0.04user 0.00system 0:00.05elapsed 98%CPU (0avgtext+0avgdata 4988maxresident)k
0inputs+0outputs (0major+409minor)pagefaults 0swaps
mingodad commented 2 years ago

And here with only DetectInfiniteLoop commented, it's still instantaneous:

/usr/bin/time peglint sql-parser-naked.peg test.sql
sql-parser-naked.peg:1361:2: 'sym_dblquote' is not referenced.
sql-parser-naked.peg:5:2: 'start_streaming' is not referenced.
sql-parser-naked.peg:1274:2: 'id_constraint_column' is not referenced.
sql-parser-naked.peg:1364:2: 'sym_backtick' is not referenced.
sql-parser-naked.peg:1874:2: '_TODO_' is not referenced.
sql-parser-naked.peg:1271:2: 'id_constraint_table' is not referenced.
sql-parser-naked.peg:1406:2: 'sym_bslash' is not referenced.
0.06user 0.00system 0:00.06elapsed 100%CPU (0avgtext+0avgdata 5064maxresident)k
0inputs+0outputs (0major+408minor)pagefaults 0swaps
yhirose commented 2 years ago

The latest improves performance of DetectInfiniteLoop. Thanks for the fine report.

mingodad commented 2 years ago

Thank you it's working fine now for the example shown above ! But it seems that you forgot to rebuild the playground with this enhancement.

yhirose commented 2 years ago

Good catch! I just updated it.