sqmedeiros / lpeglabel

An extension of LPeg that supports labeled failures
MIT License
139 stars 18 forks source link

Q about error labels and labeled failures #24

Closed wqweto closed 5 years ago

wqweto commented 6 years ago

With labeled failures it is possible to distinguish between an ordinary failure and an error. Usually, an ordinary failure is produced when the matching of a character fails, and this failure is caught by ordered choice. An error (a non-ordinary failure), by its turn, is produced by the throw operator and may be caught by a recovery rule.

I just took the sample from the README and changed m.T to m.V and nothing changed, i.e. don't see the difference b/n throwing labels and referencing ErrRules with m.V.

This

local m = require'lpeglabel'

local recp = m.P"oast"

local g = m.P{
  'S',
  S    =  m.V'A' * '.',
  A    =  m.P't' * (m.P'est' + m.V'Err'),
  Err  =  m.P'oast'
}

print(g:match("test."))   --> 6
print(g:match("toast."))  --> 7
print(g:match("oast."))   --> nil  fail  oast.
print(g:match("toward.")) --> nil  fail  ward.

. . . produces this again

6
7
nil     fail    1
nil     fail    3

Why was necessary to introduce m.T? In what context does it differ from m.V? Can you explain with a sample code that shows the difference, please.

sqmedeiros commented 6 years ago

Hi. I am without a computer these days, so my answer may be short . There is a conceptual difference, m.T should be used to indicate an error. In case you only want error reporting, you do not need a recovery rule with the same name of a label. In case there is no recovery rule associated with a label 'l', the parser will finish the matching, it will not backtrack. This is not guaranteed in case you use m.V to report a regular failure. The implementation of LPegLabel allows to use m.V to refer to a recovery rule, but a different implementation of labeled PEGs may not allow this. In case you use m.T inside a predicate, the associated recovery rule will not be used, so the use of m.T should end the matching of this predicate.

wqweto commented 6 years ago

Thank you for the explanation. This backtracking that is not guaranteed by m.V would be very good to see a sample usage that demonstrates the difference. A sample that gives different results with m.T and m.V because of different backtracking.

Because as far as I can figure out if m.V'Err' fails this ends the matching with failure too -- by returning failure to parent expression (sequence, ordered choice, rule, etc.) which can return failure consequently or in the case of ordered choice continue with the next choice effectively swallowing the failure of the first choice.

A sample explaining the difference in backtracking would be most appreciated.

sqmedeiros commented 6 years ago

Sorry for the delay. I am back from my trip now. The example above uses m.T to signal an error when parsing rule A, you should notice that C does not match after the error is thrown. `g1T = m.P{ "S", S = m.V"A" * m.V"B"+ m.V"C", A = m.T"Err", B = m.P"b"^0, C = m.P(1)
}

print(g1T:match("bb")) --> nil Err 1`

Below, I am using m.V instead of m.T. I used the pattern -m.P(1) * m.P(1), which always fails, to indicate a failure. After the failure, the parser backtracks and tries to match C. `g1T = m.P{ "S", S = m.V"A" m.V"B"+ m.V"C", A = m.V'Err', B = m.P"b"^0, C = m.P(1), Err = -m.P(1) m.P(1)
}

print(g1T:match("bb")) --> 2`

Please, let me know whether this example suffices or not.

wqweto commented 6 years ago

Thank you for the sample! What I noticed in first g1T grammar is that the moment I add Err as to impl error recovery both grammars start behaving the same and return 2.

My conclusion is that in current lpeglabel m.T and m.V are having the same behavior when they reference existing rules (so called error recovery case) and that the difference is only when they reference non-existing rules -- m.T thows the error label, while m.V bombs with compile error like "rule '%s' undefined".

Am I right that this somewhat demystifies m.T -- one might as well use m.V when implementing error recovery?

sqmedeiros commented 6 years ago

You can also use m.V to implement error recovery, although the use of m.T instead of m.V makes more clear that a given rule is a recovery rule. There is a difference between m.T and m.V when you use predicates. In case you use m.T to throw a label l inside a predicate. the recovery rue associated with l will not be used, and the predicate will fail. In case you use m.Vthe recovery rule will still be used. See the examples below:

`local m = require"lpeglabel"

g1T = m.P{ "S", S = m.V"A" * m.V"B" + m.V"C", A = m.T"Err", B = m.P"b"^0, C = m.P(1), Err = m.P"", }

print(g1T:match("bb")) --> 3

g2T = m.P{ "S", S = #m.V"A" * m.V"B" + m.V"C", A = m.T"Err", B = m.P"b"^0, C = m.P(1), Err = m.P"", }

print(g2T:match("bb")) --> 2

g1V = m.P{ "S", S = m.V"A" * m.V"B" + m.V"C", A = m.V"Err", B = m.P"b"^0, C = m.P(1), Err = m.P"", }

print(g1V:match("bb")) --> 3

g2V = m.P{ "S", S = #m.V"A" * m.V"B" + m.V"C", A = m.V"Err", B = m.P"b"^0, C = m.P(1), Err = m.P"", }

print(g2V:match("bb")) --> 3`

wqweto commented 6 years ago

Ahaaa, that's a fine nuance that sets them apart -- good to know it. Thank you for the info!

I've been reading Exception Handling for Error Reporting in PEGs comparing to lpeglabel implementation. Seems like you abandoned some of the ideas (like the extra set of labels that the ordered choice should catch) wondering why. Is it that the idea proved to be unnecessary in practice or just very hard to implement in current VM?

sqmedeiros commented 6 years ago

We refined some of the ideas. A more recent paper focusing on error recovery, and that is closer to the current implementation (version 1.5) of LPegLabel is this one: https://arxiv.org/abs/1806.11150 https://dl.acm.org/citation.cfm?doid=3167132.3167261

The labeled ordered choice seemed more useful as a prediction mechanism than as an error recovery mechanism, since that we could not use it to get the exact error location and the surrounding context.

wqweto commented 6 years ago

Oh, and I somehow missed this new paper or just didn't realize these were different (new) PDFs. Anyway, I shamelessly hoisted the mini-java grammar for my testing purposes :-))

Hope you don't mind my rudimentary (and buggy) reimplementation of the label throwing technique in the PEG parser generator I've been fiddling recently. I'm somewhat reluctant to use try/catch of the target language (VB6) so currently my m.T is a syntax sugar for m.V and it seems good enough for my basic purposes.

sqmedeiros commented 6 years ago

Not at all, feel free to adapt the technique for your needs. Your question about m.V and m.T was clever.