sebastianriese / pyLR1

A pure python LR(1)/LALR(1) parser generator

Show symbols which are acceptable in the state when emitting a syntax error #23

Closed horazont closed 7 years ago

horazont commented 7 years ago

Example grammar:

%lexer

%def
    space [\ \t\n]

{space}+  %restart
\+        OP_PLUS
\*        OP_STAR
/         OP_SLASH
-         OP_MINUS
\(        PAREN_OPEN
\)        PAREN_CLOSE
[0-9]+    INTEGER

%parser

%left OP_PLUS OP_MINUS
%left OP_SLASH OP_STAR

expression:
  value:
    $$.sem = $1.sem
  expression OP_PLUS value:
    $$.sem = ('+', $1.sem, $3.sem, $$.pos)
  expression OP_STAR value:
    $$.sem = ('*', $1.sem, $3.sem, $$.pos)
  expression OP_MINUS value:
    $$.sem = ('-', $1.sem, $3.sem, $$.pos)
  expression OP_SLASH value:
    $$.sem = ('/', $1.sem, $3.sem, $$.pos)

value:
  INTEGER:
    $$.sem = ('int', int($1.sem), $1.pos)
  PAREN_OPEN expression PAREN_CLOSE:
    $$.sem = $2.sem
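
For reading the dumps further down, it may help to see how the tuples built by these semantic actions can be consumed. The following is a minimal sketch, not part of pyLR1: `evaluate` and `OPS` are hypothetical names, the position objects are replaced by `None`, and mapping `/` to true division is an assumption.

```python
import operator

# Operator tags as produced by the semantic actions in the grammar above.
# Using true division for '/' is an assumption made for this sketch.
OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}


def evaluate(node):
    """Evaluate an AST tuple of the form ('int', value, pos) or (op, lhs, rhs, pos)."""
    if node is None:
        raise ValueError("no AST available (the parse failed)")
    tag = node[0]
    if tag == 'int':
        _, value, _pos = node
        return value
    _, lhs, rhs, _pos = node
    return OPS[tag](evaluate(lhs), evaluate(rhs))


# Same shape as the AST for the input 1+1, with positions stubbed out.
pos = None
ast = ('+', ('int', 1, pos), ('int', 1, pos), pos)
print(evaluate(ast))  # 2
```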

Example output (for input 1+):

0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (2, 0) $EOF ""
Traceback (most recent call last):
  File "testparser.py", line 31, in <module>
    ast = parser.Parse()
  File "parser.py", line 324, in Parse
    for _, name in self.ntable[self.stack[-1].state]
  File "parser.py", line 228, in error
    raise SyntaxError(message=msg, position=pos)
<run_path>.SyntaxError: test.input Line 2:0 - 2:0:syntax error: expected one of: value

Example output (for input (2):

0 # (0, 5) PAREN_OPEN "("
0 5 # (0, 4) INTEGER "2"
0 5 4 # (1, 5) $EOF ""
0 5 6 # (1, 0) $EOF ""
0 5 7 # (2, 0) $EOF ""
Traceback (most recent call last):
  File "testparser.py", line 31, in <module>
    ast = parser.Parse()
  File "parser.py", line 324, in Parse
    for _, name in self.ntable[self.stack[-1].state]
  File "parser.py", line 228, in error
    raise SyntaxError(message=msg, position=pos)
<run_path>.SyntaxError: test.input Line 1:0 - 1:0:syntax error: expected one of: OP_SLASH, PAREN_CLOSE, OP_STAR, OP_PLUS, OP_MINUS

horazont commented 7 years ago

Oops, this pull request should have been opened against the support branch. The commit is based on support.

horazont commented 7 years ago

More examples:

Using:

#!/usr/bin/python3
import io
import sys
import runpy
import pprint

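# Execute the generated parser module; run_path returns its globals as a dict.
mod_parser = runpy.run_path(sys.argv[1])

# Read the input file as a list of byte lines; each line is parsed on its own.
with open(sys.argv[2], "rb") as f:
    lines = list(f)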

asts = []
for i, line in enumerate(lines):
    print("PARSE {}: {}".format(i, line))
    try:
        lexer = mod_parser["Lexer"](
            io.BytesIO(line),
            filename=sys.argv[2]
        )
        parser = mod_parser["Parser"](lexer)
        lexer.parser = parser
        ast = parser.Parse()
    except mod_parser["SyntaxError"] as exc:
        print("SYNTAX ERROR: {}".format(exc))
        ast = None
    except mod_parser["Incomplete"] as exc:
        print("INCOMPLETE INPUT: {}".format(exc.next_states))
        ast = None
    else:
        pprint.pprint(ast)
    asts.append(ast)
    print()

for i, ast in enumerate(asts):
    print("AST {}:".format(i))
    pprint.pprint(ast, indent=4)
    print()

and the input:

1
1+
1+1
1+(
1+(2
1+(2+)

The following output is generated:

PARSE 0: b'1\n'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) $EOF ""
0 6 # (1, 0) $EOF ""
0 1 # (1, 7) $EOF ""
('int', 1, <<run_path>.Position object at 0x7fd562b60d68>)

PARSE 1: b'1+\n'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (2, 0) $EOF ""
SYNTAX ERROR: test.input Line 2:0 - 2:0:syntax error: expected one of: value

PARSE 2: b'1+1\n'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (0, 4) INTEGER "1"
0 1 2 4 # (1, 5) $EOF ""
0 1 2 3 # (1, 1) $EOF ""
0 1 # (1, 7) $EOF ""
('+',
 ('int', 1, <<run_path>.Position object at 0x7fd562b611d0>),
 ('int', 1, <<run_path>.Position object at 0x7fd562b61278>),
 <<run_path>.Position object at 0x7fd562b619e8>)

PARSE 3: b'1+(\n'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (0, 5) PAREN_OPEN "("
0 1 2 5 # (2, 0) $EOF ""
SYNTAX ERROR: test.input Line 2:0 - 2:0:syntax error: expected one of: expression

PARSE 4: b'1+(2\n'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (0, 5) PAREN_OPEN "("
0 1 2 5 # (0, 4) INTEGER "2"
0 1 2 5 4 # (1, 5) $EOF ""
0 1 2 5 6 # (1, 0) $EOF ""
0 1 2 5 7 # (2, 0) $EOF ""
SYNTAX ERROR: test.input Line 2:0 - 2:0:syntax error: expected one of: OP_MINUS, OP_SLASH, PAREN_CLOSE, OP_PLUS, OP_STAR

PARSE 5: b'1+(2+)'
0 # (0, 4) INTEGER "1"
0 4 # (1, 5) OP_PLUS "+"
0 6 # (1, 0) OP_PLUS "+"
0 1 # (0, 2) OP_PLUS "+"
0 1 2 # (0, 5) PAREN_OPEN "("
0 1 2 5 # (0, 4) INTEGER "2"
0 1 2 5 4 # (1, 5) OP_PLUS "+"
0 1 2 5 6 # (1, 0) OP_PLUS "+"
0 1 2 5 7 # (0, 2) OP_PLUS "+"
0 1 2 5 7 2 # (2, 0) PAREN_CLOSE ")"
SYNTAX ERROR: test.input Line 1:5 - 1:6:syntax error: expected one of: value

AST 0:
('int', 1, <<run_path>.Position object at 0x7fd562b60d68>)

AST 1:
None

AST 2:
(   '+',
    ('int', 1, <<run_path>.Position object at 0x7fd562b611d0>),
    ('int', 1, <<run_path>.Position object at 0x7fd562b61278>),
    <<run_path>.Position object at 0x7fd562b619e8>)

AST 3:
None

AST 4:
None

AST 5:
None
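
The "expected one of" lists in the output above are consistent with reporting the symbol that follows the dot in each kernel item of the state on top of the stack: after `expression OP_PLUS` that is `value`, and after `PAREN_OPEN expression` it is `PAREN_CLOSE` plus the four operators. The sketch below illustrates that reading only. It is not taken from the pull request, and `Item`, `expected_symbols`, and `format_error` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """An LR item: a production plus the position of the dot in its right-hand side."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int


def expected_symbols(kernel_items):
    """Return the symbols directly after the dot, without duplicates, in first-seen order."""
    seen = []
    for item in kernel_items:
        if item.dot < len(item.rhs):  # skip completed items (dot at the end)
            symbol = item.rhs[item.dot]
            if symbol not in seen:
                seen.append(symbol)
    return seen


def format_error(kernel_items):
    """Build a message in the style of the output shown in this thread."""
    return "syntax error: expected one of: " + ", ".join(expected_symbols(kernel_items))


# Kernel of the state reached after "expression OP_PLUS" (hand-written for this sketch).
after_plus = [Item("expression", ("expression", "OP_PLUS", "value"), 2)]

# Kernel of the state reached after "PAREN_OPEN expression" (hand-written for this sketch).
after_paren_expr = [
    Item("value", ("PAREN_OPEN", "expression", "PAREN_CLOSE"), 2),
    Item("expression", ("expression", "OP_PLUS", "value"), 1),
    Item("expression", ("expression", "OP_STAR", "value"), 1),
    Item("expression", ("expression", "OP_MINUS", "value"), 1),
    Item("expression", ("expression", "OP_SLASH", "value"), 1),
]

print(format_error(after_plus))        # syntax error: expected one of: value
print(format_error(after_paren_expr))
```

The order of the names in the real output differs between runs, which suggests the actual implementation collects them in an unordered container; this sketch keeps first-seen order purely to stay deterministic.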