Closed agentzh closed 7 years ago
Seeing it again in another Pegex-based parser of mine, for the Edge language. This has been bothering me ;)
@agentzh do these parses eventually complete? How long do they take to complete (vs number of warnings).
I've been thinking lately on how to minimize the recursion.
Can you write me a minimal test case that triggers the warnings?
@ingydotnet Yes it completes successfully despite those warnings and it does not seem to take unusually long to complete. There are in fact many other scripts taking much longer but complete without any warnings.
Below is a smallest example that I could ever get:
use strict;
use warnings;
use Pegex;
open my $in, "a.pgx" or die $!;
my $grammar = do { local $/; <$in> };
close $in;
open $in, "a.in" or die $!;
my $input = do { local $/; <$in> };
close $in;
pegex($grammar)->parse($input);
prog: - statement*
statement:
| assignment terminator
terminator: /-
';'
-/
assignment:
variable
/- ':=' -/
union-query
integer: /(
DASH? DIGIT+ \b
)/
union-query:
| union-all-relations
| union-select-query
union-select-query:
select-query
union-all-relations:
relation+ % /- \b 'union' + 'all' \b -/
variable:
| array-variable
| scalar-variable
identifier: /
\b (! 'select' \b ) ( ALPHA WORD* )
/
scalar-variable:
'$'
variable-name
trait*
trait:
/- ':local' -/
array-variable:
'@'
variable-name
trait*
variable-name:
identifier
table:
<symbol>1-3 % '.'
location:
| scalar-variable
relation:
| array-relation
| aliased-union-query
| function-relation
| aliased-table
aliased-table:
table alias?
select-query:
/- \b 'select' \b -/
select-arg+ % comma
from-clause?
where-clause?
group-by-clause?
symbol:
| scalar-variable
| identifier
array-relation:
array-variable alias?
aliased-union-query:
/ '(' -/
union-query
/- ')' /
alias?
function-relation:
function-call
alias?
alias:
| as identifier
| as scalar-variable
as:
/- \b 'as' \b -/
select-arg:
| aliased-arithmetic-expression
| match-all
aliased-arithmetic-expression:
arithmetic-expression alias?
match-all:
'*'
from-clause:
/- \b 'from' \b -/
data-source
where-clause:
/- \b 'where' \b -/
or-expression
group-by-clause:
/- \b 'group' + 'by' \b -/
column+ % comma
unsigned-integer: /(
DIGIT+
)/
function-call:
function
/- '(' -/
select-arg* % comma
/- ')' /
arithmetic-expression:
term+ % add-op
add-op: /-
( [-+] )
-/
term:
factor+ % mul-op
mul-op: /-
( ['*/'] )
-/
factor:
| /- '(' -/
arithmetic-expression
/- ')' -/
| atom
atom:
| value
| function-call
| qualified-column
| scalar-variable
| column
value:
number
number:
| real
| integer
real: /(
DASH?
DIGIT+
DOT
DIGIT+
\b
)/
data-source:
| explicit-join-query
| cross-join-query
| union-query
| relation
explicit-join-query:
relation
/- 'join' +/
relation
/- 'on' -/
and-expression
cross-join-query:
<relation>2+ % comma
or-expression:
and-expression+ % /+ 'or' +/
and-expression:
relational-expression+ % /+ 'and' +/
relational-expression:
| between-condition
| binary-relational-condition
between-condition:
arithmetic-expression
/- \b 'between' \b -/
arithmetic-expression
/- \b 'and' \b -/
arithmetic-expression
binary-relational-condition:
arithmetic-expression
comparison-op
arithmetic-expression
comparison-op: /-
( '>='
| '='
)
-/
qualified-column:
<symbol>2-3 % '.'
column:
<symbol>1-3 % '.'
function:
<symbol>1-2 % '.'
ws: /
(: BLANK
| EOL
)
/
line: /
ANY* EOL?
/
comma: /-
','
-/
@result :=
select finclick, finprice, coll_num, foo_amt, foo_num, avg_finprice,
efficiency, roc
from (
select $group,
sum(finclick) as finclick, sum(finprice) as finprice,
sum(item_coll_num) + sum(shop_coll_num) as coll_num,
sum(foo_direct_amt) + sum(foo_indirect_amt) as foo_amt,
sum(foo_direct_num) + sum(foo_indirect_num) as foo_num,
ifnull(round(sum(finprice) / sum(finclick),4), 0) as avg_finprice,
ifnull(round(sum(finclick) * sum(finclick) / sum(finprice), 4), 0) as efficiency,
ifnull(round((sum(foo_direct_num) + sum(foo_indirect_num)) / sum(finclick), 4), 0) as roc
from A as a
where is_shop = $is_shop and campaign_id = $campaign_id
group by $group
) as a;
And running the a.pl
script above gives
$ perl a.pl
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 206.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 206.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153.
Please let me know if you need any further information. Thanks a lot!
Figuring this out should be as simple as turning on Pegex debugging and adding a print statement to each of the subs in Pegex::Parser. Print the subname and a counter. I suspect you are triggering recurses that go just over the warning limit.
I'm travelling all day today, but can look at it after.
On Wed, May 6, 2015 at 12:02 AM, Yichun Zhang notifications@github.com wrote:
@ingydotnet https://github.com/ingydotnet Yes it completes successfully despite those warnings and it does not seem to take unusually long to complete. there are other scripts taking much longer but complete without any warnings.
Below is a smallest example that I could ever get: a.pl
use strict;use warnings;use Pegex; open my $in, "a.pgx" or die $!;my $grammar = do { local $/; <$in> };close $in; open $in, "a.in" or die $!;my $input = do { local $/; <$in> };close $in;
pegex($grammar)->parse($input);
a.pgx
prog: - statement*
statement: | assignment terminator
terminator: /- ';' -/
assignment: variable /- ':=' -/ union-query
integer: /( DASH? DIGIT+ \b )/
union-query: | union-all-relations | union-select-query
union-select-query: select-query
union-all-relations: relation+ % /- \b 'union' + 'all' \b -/
variable: | array-variable | scalar-variable
identifier: / \b (! 'select' \b ) ( ALPHA WORD* ) /
scalar-variable: '$' variable-name trait*
trait: /- ':local' -/
array-variable: '@' variable-name trait*
variable-name: identifier
table:
1-3 % '.' location: | scalar-variable relation: | array-relation | aliased-union-query | function-relation | aliased-table aliased-table: table alias? select-query: /- \b 'select' \b -/ select-arg+ % comma from-clause? where-clause? group-by-clause? symbol: | scalar-variable | identifier array-relation: array-variable alias? aliased-union-query: / '(' -/ union-query /- ')' / alias? function-relation: function-call alias? alias: | as identifier | as scalar-variable as: /- \b 'as' \b -/ select-arg: | aliased-arithmetic-expression | match-all aliased-arithmetic-expression: arithmetic-expression alias? match-all: '*' from-clause: /- \b 'from' \b -/ data-source where-clause: /- \b 'where' \b -/ or-expression group-by-clause: /- \b 'group' + 'by' \b -/ column+ % comma unsigned-integer: /( DIGIT+ )/ function-call: function /- '(' -/ select-arg\* % comma /- ')' / arithmetic-expression: term+ % add-op add-op: /- ( [-+] ) -/ term: factor+ % mul-op mul-op: /- ( ['*/'] ) -/ factor: | /- '(' -/ arithmetic-expression /- ')' -/ | atom atom: | value | function-call | qualified-column | scalar-variable | column value: number number: | real | integer real: /( DASH? DIGIT+ DOT DIGIT+ \b )/ data-source: | explicit-join-query | cross-join-query | union-query | relation explicit-join-query: relation /- 'join' +/ relation /- 'on' -/ and-expression cross-join-query: 2+ % comma or-expression: and-expression+ % /+ 'or' +/ and-expression: relational-expression+ % /+ 'and' +/ relational-expression: | between-condition | binary-relational-condition between-condition: arithmetic-expression /- \b 'between' \b -/ arithmetic-expression /- \b 'and' \b -/ arithmetic-expression binary-relational-condition: arithmetic-expression comparison-op arithmetic-expression comparison-op: /- ( '>=' | '=' ) -/ qualified-column: 2-3 % '.' column: 1-3 % '.' function: 1-2 % '.' ws: / (: BLANK | EOL ) / line: / ANY\* EOL? / comma: /- ',' -/ a.in @result := select finclick, finprice, coll_num, foo_amt, foo_num, avg_finprice, efficiency, roc from ( select $group, sum(finclick) as finclick, sum(finprice) as finprice, sum(item_coll_num) + sum(shop_coll_num) as coll_num, sum(foo_direct_amt) + sum(foo_indirect_amt) as foo_amt, sum(foo_direct_num) + sum(foo_indirect_num) as foo_num, ifnull(round(sum(finprice) / sum(finclick),4), 0) as avg_finprice, ifnull(round(sum(finclick) \* sum(finclick) / sum(finprice), 4), 0) as efficiency, ifnull(round((sum(foo_direct_num) + sum(foo_indirect_num)) / sum(finclick), 4), 0) as roc from A as a where is_shop = $is_shop and campaign_id = $campaign_id group by $group ) as a; And running the a.pl script above gives $ perl a.pl Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 206. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 206. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 187. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. Deep recursion on subroutine "Pegex::Parser::match_next" at /usr/local/Cellar/perl/5.20.1/lib/site_perl/5.20.1/Pegex/Parser.pm line 153. — Reply to this email directly or view it on GitHub https://github.com/ingydotnet/pegex-pm/issues/43#issuecomment-99347977.
I am also having this problem at https://github.com/pdl/jtl/blob/7e069e96543fa24b2a23df167eacbb0e1ad20161/poc/t/Whole.t ( Grammar at https://github.com/pdl/jtl/blob/7e069e96543fa24b2a23df167eacbb0e1ad20161/poc/share/jtls.pgx ). The parse completes fine and is not slow.
I've looked at the Pegex debugging output, and indeed it looks like it's triggered when it exceeds some limit - though I suspect that going over would be a more regular occurrence with more deeply nested inputs.
Is there some way for a grammar which knows it is dealing with a reasonably deeply-nested language to ask Pegex::Parser to suppress this warning? Or if not at the moment, would you condone a PR which permitted this?
PR certainly welcome!
Join #pegex on freenode to discuss
Closing this because this seems to be fixed by #46
After porting my LZSQL parser based on Parse::RecDescent over to Pegex, I've noticed the much deeper recursion levels in the Pegex-based parser, as evidenced by the following Perl-land on-CPU flame graph for my LZSQL parser:
http://agentzh.org/misc/flamegraph/perl-land-pegex-lzsql-parser.svg
Also, perl sometimes complaining about it too, which never happened with the old Parser::RecDescent parser:
As you can see, the
lightface/p4pv2/effect_sum.lzsql
file triggers this.