Open ratmice opened 2 years ago
From memory, yacc (or Bison maybe?) statically detects such cases and refuses to compile them (or, at least, issues a warning). If that memory is correct, we should consider doing something similar IMHO.
I wrote a gist that contains, some graphs and a breakdown of the issue, some potential ways we could go about fixing it.
the potential impacts of trying to do exactly what yacc/bison do here. I think the last potential fix mentioned (providing a mechanism to produce a pruned StateGraph
), seems like a nice way to maintain the existing relation shared between StateTable
and StateGraph
's StIdx
.
https://gist.github.com/ratmice/180cc7daa6e28a285885affd2e623f71
I mostly wanted to think through the semver ramifications of fixing this, and I think there are 2 options for fixing it with low impact to the code or semver, however changing the goto table the way yacc appears to seems to be quite impactful.
If you think that one of these low impact options is viable, I'm kind of inclined not to be in any rush to fix this,
since running a parser with conflicts, without %expect
, %expect-rr
is probably unwise, and I'm fairly convinced
this can only happen in the presence of conflicts.
Broadly speaking I'm in favour of making these errors (with the minor caveat that I'd like https://softdevteam.github.io/grmtools/master/book/errorrecovery.html#turning-lexing-errors-into-parsing-errors (where we use an otherwise-unused rule) to remain possible, but there might be a better way of doing it!). I find it hard to imagine anyone writes a grammar intending to leave such problems in.
I would think that perhaps a new declaration like %allow-unused Unmatched
could be used to disable such an error
(Or perhaps something which indicates its usage by error recovery?).
I'm mildly cognizant though of where it seems likely when you are initially writing a rule, you are less likely to have referenced within the rest of your grammar, which is kind of why I was thinking of the same way that conflicts()
and missing_from_lexer/missing_from_parser
returned from set_rule_ids
, in that those don't actually produce a StateTableError
, but are treated as warnings/errors by the individual tools.
One last thing, is I'm getting the impression that unused (due to conflicts)
it seems like is just a way of deriving which state of the conflict was dropped (And I think in yacc just Shift/Reduce
ones), I think it is basically able to be derived already from the information within Conflicts
, since I believe it returns the PIdx
of the production which was discarded during conflict resolution.
So it seems like we might have everything to produce that within the tools already.
I would think that perhaps a new declaration like %allow-unused Unmatched could be used to disable such an error
This feels quite yacc-y to me! I can certainly live with that.
[I must admit that the other parts of statetable generation / analysis have receded so far into the back of my brain that I can't remember how this edge-case stuff might work!]
Looking at this again, in the original report I had mentioned adding a case statement, I believe that catches some cases where we lack forward progress, but also misses some.
Because there ways to exhibit the same issue with mutual recursion where the indexes flip-flop back and forth.
That would go undetected just adding the Some(i) if i == usize::from(stidx) + 1 => None
case to goto.
One of the "fun" things about my project is running the parser on strange, half edited/incomplete changes. Here is one such case I encountered that way, and have minimized.
given the input character
a
, this will cause an infinite loop pushing topstack
between https://github.com/softdevteam/grmtools/blob/master/lrpar/src/lib/parser.rs#L297 https://github.com/softdevteam/grmtools/blob/master/lrtable/src/lib/statetable.rs#L461-L466 adding a case like:Some(i) if i == usize::from(stidx) + 1 => None,
togoto
fixes it, (i.e. the return value of goto ==prior
).Filing this as a bug report rather than sending a PR though, because I haven't yet tested it against valid parsers, or as of yet tried to surmise if this case can only and always lead to infinite recursion or if it ever actually comes up in a valid way.