vsbenas / parser-gen

A parser generator in Lua using PEG syntax.
MIT License

Skipping spaces after the matching of a recovery expression #10

Closed sqmedeiros closed 7 years ago

sqmedeiros commented 7 years ago

Hi, @vsbenas.

Currently, after a recovery expression matches, parser-gen does not automatically consume the spaces that follow it, as it does for regular patterns described in the grammar. This causes some unexpected errors. For example, in the following code:

```lua
package.path = package.path .. ";../?.lua"
local pg = require "parser-gen"
local peg = require "peg-parser"
local errs = {
  rcblk = { "missing end of block", "(!'}' .)* '}'" },
  condw = {"missing condition in while", "'b'"},
  body = {"missing body statement in while", "'d'"},
}
pg.setlabels(errs)

local grammar = pg.compile([[

  prog <- blockstmt !.
  blockstmt <- '{' stmt '}'^rcblk
  stmt <- whilestmt / blockstmt
  whilestmt <- 'while' exp^condw stmt^body
  exp <- [0-9]+
  HELPER <- ';' / %nl / %s / !.
  SYNC <- (!HELPER .)
  SKIP <- %s / %nl
]], _, false, false)
grammar:pcode()
local errors = 0
local function printerror(desc,line,col,sfail,trec)
  errors = errors+1
  print("Error #"..errors..": "..desc.." on line "..line.."(col "..col..")")
end

local function parse(input)
  errors = 0
  result, errors = pg.parse(input,grammar,printerror)
  return result, errors
end

if arg[1] then
  -- argument must be in quotes if it contains spaces
  local input = io.open(arg[1]):read("*a")
  res, errs = parse(input)
  peg.print_t(res)
  peg.print_r(errs)
end
local ret = {parse=parse}
return ret
```

When given the following input:

```
{ while 1 { };

}
```

I was expecting an error message related to label `rcblk`, but I got:

```
Error #1: Syntax error on line 5(col 1)
nil
[1] => {
  [msg] => 'Syntax error'
  [line] => '5'
  [col] => '1'
}
```

I fixed this issue by changing the initial rule to: `prog <- blockstmt %s !.`

Then I got the expected error and a corresponding AST:

```
Error #1: missing end of block on line 3(col 3)
rule='prog',
{
  rule='blockstmt',
  '{',
  {
    rule='stmt',
    {
      rule='whilestmt',
      'while',
      {
        rule='exp',
        '1',
      },
      {
        rule='stmt',
        {
          rule='blockstmt',
          '{',
          '}',
        },
      },
    },
  },
},
[1] => {
  [msg] => 'missing end of block'
  [line] => '3'
  [label] => 'rcblk'
  [col] => '3'
}
```

I think a function such as `pattspaces` should be applied to recovery expressions, so that the spaces following a matched recovery expression are consumed automatically and do not have to be handled in the grammar.
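To make the suggestion concrete, here is a minimal sketch of the idea in plain Lua. It is not parser-gen's implementation and does not use LPeg. The helper names (`skip_spaces`, `with_trailing_spaces`, `recover_block_end`, `at_end`) are hypothetical, and Lua string patterns stand in for the real PEG patterns. It only illustrates why the end-of-input check fails after a recovery match unless the trailing whitespace is consumed, assuming the input ends with a newline.

```lua
-- Hypothetical helpers for illustration only; none of these names come from parser-gen.
-- A matcher is a function (input, pos) that returns the next position, or nil on failure.

-- Consume any run of whitespace starting at pos and return the position after it.
local function skip_spaces(input, pos)
  local _, last = input:find("^%s*", pos)
  return (last or pos - 1) + 1
end

-- Wrap a matcher so that whitespace after a successful match is consumed too.
local function with_trailing_spaces(matcher)
  return function(input, pos)
    local next_pos = matcher(input, pos)
    if next_pos then
      return skip_spaces(input, next_pos)
    end
    return nil
  end
end

-- Stand-in for the recovery expression (!'}' .)* '}':
-- skip everything up to and including the next '}'.
local function recover_block_end(input, pos)
  local _, last = input:find("}", pos, true)
  if last then
    return last + 1
  end
  return nil
end

-- Stand-in for the end-of-input check `!.`
local function at_end(input, pos)
  return pos > #input
end

-- Assumed input: the example from this issue, with a trailing newline.
local input = "{ while 1 { };\n\n}\n"
-- The ';' is where the closing '}' of the outer block was expected.
local error_pos = input:find(";", 1, true)

local plain = recover_block_end(input, error_pos)
local wrapped = with_trailing_spaces(recover_block_end)(input, error_pos)

print(at_end(input, plain))    -- false: the trailing newline is still unconsumed
print(at_end(input, wrapped))  -- true: the wrapper consumed the trailing newline
```

With the unwrapped recovery matcher, the final newline is left in the input, so the end-of-input check fails and a generic syntax error would follow. Wrapping the recovery matcher removes the need to add `%s` to the start rule by hand.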

vsbenas commented 7 years ago

Thanks, @sqmedeiros. Fixed and tested: https://github.com/vsbenas/parser-gen/commit/4b74750c676bcfd9989002cf5c4b093c0553e1d8