Open rns opened 9 years ago
Looks good. Is this a move away from the notation in Roberto's paper?
Roberto uses bare literals for terminals and notation like lpeg.V"A" for non-terminals (V for variables) [1] -- his grammars don't have terminal symbols, unlike LUIF's structural grammars.
D2L (now) uses luif.S'symbol' for symbols (both terminals and non-terminals), luif.L'literal' for literals, and luif.C'[0-9]' for charclasses.
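
To make the notation concrete, here is a minimal sketch of how such constructors could tag their arguments. The `luif` table below is a hypothetical stand-in written for illustration, and the `kind`/`value` field names are assumptions; the actual D2L implementation may differ.

```lua
-- Hypothetical stand-in for the luif module; the real D2L API may differ.
local luif = {}

-- Build a constructor that wraps its argument in a tagged table.
local function tagged(kind)
  return function(value)
    return { kind = kind, value = value }
  end
end

luif.S = tagged("symbol")     -- terminals and non-terminals
luif.L = tagged("literal")    -- literal strings
luif.C = tagged("charclass")  -- character classes

-- Lua lets a call with a single string literal drop the parentheses,
-- so luif.S'digit' is the same call as luif.S('digit').
local rhs = { luif.S'digit', luif.L'+', luif.C'[0-9]' }
for _, item in ipairs(rhs) do
  print(item.kind, item.value)
end
```

Every item costs a function call here, which is what makes wrapped symbols verbose when symbols dominate a grammar.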
Symbols get the most use in structural grammars, and sequences need stricter syntax -- so using bare literals for symbols and luif.S{ } for sequences simplifies both processing and syntax, as I saw while prototyping D2L early on.
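
A rough sketch of why this simplifies processing: with bare strings as symbols, a rule's right-hand side can be classified by a plain type check. Everything below is an assumption made for illustration (the stand-in `luif.S`, the `classify` helper, the field names, and the `"+"` quantifier), not the actual D2L code.

```lua
-- Hypothetical stand-in for the luif module; names and fields are assumptions.
local luif = {}

-- Called with a table, S marks a sequence: { item, quantifier, ... }.
function luif.S(spec)
  assert(type(spec) == "table", "luif.S{...} expects a table")
  return { kind = "sequence", item = spec[1], quantifier = spec[2] }
end

-- Classify one right-hand-side item of a rule.
local function classify(item)
  if type(item) == "string" then
    return "symbol"          -- bare literal: needs no wrapper
  elseif type(item) == "table" and item.kind == "sequence" then
    return "sequence"
  end
  error("unsupported RHS item")
end

local rhs = { "number", luif.S{ "digit", "+" } }
for i, item in ipairs(rhs) do
  print(i, classify(item))
end
```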
[1] http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#grammar
Done in 47cde97163523b16bf1cfb532857446fc639b13b, 869d4e134032e2b0aa29a29067768e80bb638694, 9e21535, https://github.com/rns/libmarpa-bindings/commit/999e31fb39fa83fd6a64fba339c22c71a58033c5
A side effect is that there is no function call and location is reported less accurately than, e.g., in the case of literals (luif.L()), which can get line number via debug.getinfo().
This of course applies only to D2L used in Lua source files -- LUIF parser will be able to add full location objects as needed.
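
The difference can be sketched with the standard debug.getinfo() call. The `L` function below is a hypothetical stand-in for luif.L, not the actual implementation, and the `source`/`line` field names are assumptions.

```lua
-- Hypothetical stand-in for luif.L; not the actual D2L implementation.
local function L(text)
  -- Level 2 is the code that called L; "l" requests currentline,
  -- "S" requests source information such as short_src.
  local info = debug.getinfo(2, "Sl")
  return { kind = "literal", value = text,
           source = info.short_src, line = info.currentline }
end

local rule = {
  "expr",   -- bare string: no call happens here, so no line is captured
  L"+",     -- call: records the line this literal appears on
  L"-",     -- call on the next line: records a later line number
}
print(rule[2].source, rule[2].line)
print(rule[3].source, rule[3].line)
```

A bare string is just a value in the table constructor, so nothing runs at that point that could ask the debug library where it is.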
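Ideas:
- add an optional luif.R() method (https://github.com/rns/libmarpa-bindings/blob/master/lua/kollos/luif.lua#L149) to mark rules if their location in the source file is important. {...} will also work, as in this example: https://github.com/rns/libmarpa-bindings/blob/master/lua/kollos/d2l.lua#L108
- imitate LPeg, which reports errors at the line where the grammar starts: https://gist.github.com/rns/9b788e134d297ed83984

That LPeg can find line-within-grammar is interesting. That code might be worth looking at for Kollos.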
On Thu, May 21, 2015 at 11:06 PM, rns notifications@github.com wrote:
A side effect is that there is no function call and location is reported less accurately than, e.g., in the case of literals (luif.L()), which can get line number via debug.getinfo().
This of course applies only to D2L used in Lua source files -- LUIF parser will be able to add full location objects as needed.
Ideas:
- add optional luif.R() https://github.com/rns/libmarpa-bindings/blob/master/lua/kollos/luif.lua#L149 method to mark rules if their location in the source file is important. {...} will also work -- example https://github.com/rns/libmarpa-bindings/blob/master/lua/kollos/d2l.lua#L108 .
- imitate LPeg, which reports errors [at the line where the grammar starts](https://gist.github.com/rns/9b788e134d297ed83984).
I'm inclined to make luif.S{ item, quantifier, separation_specifier, separator } define sequences (rather than symbols), abandon luif.Q(), and use bare literals for symbols, except for '|', '||', '%%', '~' and '%'. The rationale is clarity and good Huffman coding -- symbols are used most frequently -- and sequences need strict notation.
With this, the LUIF calculator grammar will become:
vs. the current