Closed sasq64 closed 4 years ago
@sasq64, thanks for the feedback. Could you give me the entire grammar that causes the freeze? I'll take a loot at it when I have time.
I have actually experienced such crash on the playground a couple of times. On both cases, my grammars had problem which caused infinite loop. So I made the following infinite loop detector to avoid the crashes. (You can also see some unit test cases with [infinite loop]
tag. in test/test.cc
.)
https://github.com/yhirose/cpp-peglib/blob/19c6c2615c0265f917746a2c75fd6cb39388fb9f/peglib.h#L1958
Anyway, you might have found another case that causes infinite loop that the the infinite loop detector currently cannot detect. Once I get your grammar, I'll look into it.
In order to fix the freeze on playground, if you are using Chrome, please go to chrome://settings/siteData
, find yhirose.github.io
, and clear the Local Storage data.
@sasq64, any response?
My grammar is pretty messy and I haven't reduced it to something simple yet. Will try tomorrow.
But I also notice that often when the grammar fails to match what it wants it goes into an infinite loop in my normal program?
So for instance the parsing never ends for the string :a:
with this grammar, when I run it in C++.
The playground doesn't freeze here though, it just returns no result. It doesn't complain on the grammar itself and it works for normal cases.
With the :a:
string, MetaArgs
doesn't match so it falls back to Program
which was not my intention, and it gets stuck in OpLine
which gets matched with 0 semantic arguments and keeps repeating...
Root <- RootExpression / RootDefinition / RootList / Program
RootExpression <- ':x:' Expression
RootDefinition <- ':d:' FnDef
RootList <- ':a:' MetaArgs
Program <- (Line (EOL / ''))*
Line <- (MacroCall / AssignLine / OpLine) _? LineComment?
AssignLine <- _? ('*' / Assignee) _? '=' Expression
Assignee <- Symbol _?
OpLine <- ~Label? _? (MacroCall / Test / Meta / Instruction)?
Instruction <- Opcode Arg?
MacroCall <- Call
Meta <- MetaName _ MetaText Block? (_ Symbol _ Block)*
MetaText <- (!(EOL / '}' / '{' / LineComment) .)*
MetaName <- '!' Symbol
MetaArgs <- MetaArg (_ ',' MetaArg)* ','?
MetaArg <- String / Expression / Call
Test <- "!test" _ TestName _ Block
TestName <- Symbol
FnDef <- FnName '(' FnArgs ')'
FnName <- Symbol
FnArgs <- (FnArg (',' FnArg)*)?
FnArg <- _? Symbol _?
Block <- EOL? _ '{' BlockContents '}'
BlockContents <- SkipProgram
SkipProgram <- (SkipLine EOL)* SkipLine?
SkipLine <- SkipLabel? _? (SkipMeta / SkipInstruction)? _? LineComment?
SkipLabel <- (_? DotSymbol ':') / (DotSymbol (_ / &EOL))
SkipInstruction <- (!(EOL / '}') .)*
SkipMeta <- MetaName _ (!(EOL / '}' / '{') .)* Block?
String <- _ ["] StringContents ["] _
StringContents <- (!["] .)*
Assign <- Symbol '=' Expression
Label <- (_? DotSymbol ':') / (DotSymbol (_ / &EOL))
Arg <- _ (LabelRef / Acc / Imm / IndY / IndX / Ind / AbsX / AbsY / Abs)
IndX <- '(' Expression (TailX0 / TailX1)
TailX0 <- ')' _? ',' _? 'X'i
TailX1 <- ',' _? 'X'i _? ')'
IndY <- '(' Expression (TailY0 / TailY1)
TailY0 <- ')' _? ',' _? 'Y'i
TailY1 <- ',' _? 'Y'i _? ')'
Ind <- '(' Expression ')' &(!Operator)
Acc <- 'a'i &(!Symbol)
Abs <- Expression
AbsX <- Expression ',' _? 'X'i
AbsY <- Expression ',' _? 'Y'i
Imm <- '#' Expression
LabelRef <- '-'+ / '+'+
Opcode <- Symbol
HexNum <- ('$' / '0x') [0-9a-fA-F]+
Octal <- '0o' [0-7]+
Binary <- '0b' [01]+
Decimal <- [0-9]+
Number <- HexNum / Binary / Octal / Decimal
LineComment <- ';' (!EOL .)* &EOL
~_ <- [ \t]*
Symbol <- [_A-Za-z] [_A-Za-z0-9]*
DotSymbol <- LabelChar / ([._A-Za-z] [_A-Za-z0-9]*)
LabelChar <- '-' / '+' / '$'
EOL <- '\r\n' / '\r' / '\n'
EOT <- !.
Expression <- Atom (Operator Atom)* {
precedence
L &&
L ||
L |
L ^
L &
L == !=
L < > <= >=
L <=>
L << >>
L - +
L / * % \
}
Atom <- _? (Star / Unary / Number /
FnCall / Index / Variable / '(' Expression ')') _?
Star <- '*'
Index <- Variable '[' Expression (':' Expression)? ']'
Unary <- UnOp Atom
UnOp <- ('!' / '~' / '-' / '<' / '>' )
FnCall <- Call
Call <- CallName '(' CallArgs ')'
CallName <- Symbol
CallArgs <- (CallArg (',' CallArg)*)?
CallArg <- String / Expression
Operator <-
'&&' / '||' / '<<' / '>>' / '==' / '!=' /
'+' / '-' / '*' / '/' / '%' / '\\' /
'|' / '^' / '&' / '<' / '>'
Variable <- '.'? (Symbol '.')* Symbol
@sasq64, you can get more information by tracing with peglint --trace
. Here are the steps to do it:
1) Save the grammar to a.peg
2) Run peglint a.peg --source ":a:" --trace
Hope it helps you to debug the grammar. The shortest grammar which causes the same problem is bellow:
S <- ''*
If we parse source text 'a' with the above grammar, it will fall into the same infinite loop as well. I'll try to find a way to detect such a grammar.
@sasq64, I changed cpp-peglib to detect such a grammar. If you put your grammar in the latest playground, you now see the error message "1:54 infinite loop is detected in 'Program'".
Thank you so much!
My problem seems to come from my (EOL / '')
rule... I have assumed that ''
means end-of-text, since I want to match also the last line in the file even if it has no EOL.
Is there another way to do this?
Actually I found https://github.com/xored/peg/blob/master/docs/grammar-tips.md which give some tips, and helped me understand that a loop that can match nothing is a problem, and the ''
must be moved out of the loop.
So
Program <- (Line (EOL))* (Line EOT)?
avoids it, except now the action of my last Line is called twice... (which is correct, but I generate output from actions so I need to figure out how to avoid it).
After a while the editor freezes and even reloading doesn't help.
I'm trying to implement recursive curly brace parsing so that
Can be parsed by something like
Have been trying many different ways to skip lines in the block but no luck so far. Any pointers?