Closed adMartem closed 2 months ago
This is a grammar that will fall into the infinite loop if the throw on line 232 is not present. When the input is section-name SECTION. sen-tence. paragraph-name. sentence. EOF.
, the infinite loop occurs if these commits are not applied (namely the one at line 232 to re-throw).
Funny story, when I did this PR I noticed in the changes that line 219 (${expansion.customErrorRecoveryBlock!}
) was one replaced by my new code that allowed recovery code blocks. I thought I had somehow erroneously added that line in an earlier commit and corrected it in this one, but when was it originally added? Well, it turns out it has been there for a long time and pre-dates any of my changes. I guess it was anticipated that a recovery code block was needed in this case, and that was put in the templates and Expansion
, but never used as far as I can tell. I think it can go away, since the recoveryBlock
is now present. I never noticed it during the ->{...}
changes since non-terminal re-sync was the last thing I looked at, and when I did, I thought it was a typo made by me.
Perhaps the seemingly unused RECOVER TO
production syntax is related to this? I think the solution is something in the area of "skip tokens until you get to one that could legally follow this production if it were correctly formed".
I've found that there are some unexpected (by me) side-effects of fault tolerant parsing in some areas of the COBOL parser where no re-syncing is involved. I haven't investigated these cases yet, but superficially they seem to involve skipping (invalidating) a token which is invalid in the current choice, but is valid (and meaningful) in a different choice. It is possible that the fault-tolerant token handling is just revealing a lookahead deficiency in the grammar, but I don't see how it compiles correctly when not in fault-tolerant mode in that case.
The failures seem unrelated - I would try rebasing and see if that resolves the test failures.
745de06 is unrelated to fault tolerence. For some reason I have not figured out it breaks my COBOL parser for about 0.5% of it's tests. My first impression is that the lookahead and actual scanning are somehow out of sync, but I'm still working on isolating what is happening. That is hampered by the fact that I cannot debug it in situ (the parser is too big), so I have to move generated code around so that what I'm interested in is below line 65536. Kind if tedious.
Closing this draft until I have time to work on it.
More specifically, (Temporarily) fix infinite loop in fault-tolerant recovery and add some comments to highlight need to revisit this area; allow "!" or "!-> {...}" for re-sync points, and fix some fault-tolerant bugs. Pay special attention to line 232 in
src/templates/java/ParserProductions.java.ftl
. I don't think this is the best fix, but I'm not sure. I have this feeling there is something missing that we really need to recover to the best place in the source, rather than punt here. It's like there is a follow set that should be skipped to somehow in this case.