uwol / proleap-cobol-parser

ProLeap ANTLR4-based parser for COBOL
MIT License

Parser runs into "no viable alternative" exception - maybe nested performs with indexed variables? #54

Closed Reinhard-Prehofer closed 6 years ago

Reinhard-Prehofer commented 6 years ago

Parsing the attached file leads to the following error:

Cobolfile: a2600215.CBL threw exception: {}
java.lang.RuntimeException: syntax error in line 339:71 no viable alternative at input 'DISPLAY '------------------------------------------'-'
    at io.proleap.cobol.asg.runner.ThrowingErrorListener.syntaxError(ThrowingErrorListener.java:20)
    at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
    at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
    at org.antlr.v4.runtime.DefaultErrorStrategy.reportNoViableAlternative(DefaultErrorStrategy.java:282)
    at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:121)
    at io.proleap.cobol.Cobol85Parser.ifThen(Cobol85Parser.java:31532)
    at io.proleap.cobol.Cobol85Parser.ifStatement(Cobol85Parser.java:31335)
    at io.proleap.cobol.Cobol85Parser.statement(Cobol85Parser.java:24801)
    at io.proleap.cobol.Cobol85Parser.performInlineStatement(Cobol85Parser.java:35565)
    at io.proleap.cobol.Cobol85Parser.performStatement(Cobol85Parser.java:35488)
    at io.proleap.cobol.Cobol85Parser.statement(Cobol85Parser.java:24857)
    at io.proleap.cobol.Cobol85Parser.sentence(Cobol85Parser.java:24440)
    at io.proleap.cobol.Cobol85Parser.paragraph(Cobol85Parser.java:24376)
    at io.proleap.cobol.Cobol85Parser.paragraphs(Cobol85Parser.java:24293)
    at io.proleap.cobol.Cobol85Parser.procedureDivisionBody(Cobol85Parser.java:24151)
    at io.proleap.cobol.Cobol85Parser.procedureDivision(Cobol85Parser.java:23223)
    at io.proleap.cobol.Cobol85Parser.programUnit(Cobol85Parser.java:880)
    at io.proleap.cobol.Cobol85Parser.compilationUnit(Cobol85Parser.java:783)
    at io.proleap.cobol.Cobol85Parser.startRule(Cobol85Parser.java:727)
    at io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl.parseFile(CobolParserRunnerImpl.java:190)
    at io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl.analyzeFile(CobolParserRunnerImpl.java:94)
    at de.dvzmv.fabea.infrastruktur.csi.parser.SingleCobolParser.parseFile(SingleCobolParser.java:56)
    at de.dvzmv.fabea.infrastruktur.csi.parser.SingleCobolParser.main(SingleCobolParser.java:140)
[a2600215.CBL.txt](https://github.com/uwol/cobol85parser/files/1575474/a2600215.CBL.txt)

uwol commented 6 years ago

Ok, this one is tricky.

055530             DISPLAY '------------------------------------------' 08.01.13
055540-                    '-------------------------------------'      08.01.13
055530             DISPLAY '------------------------------------------ 08.01.13
055540-                    '-------------------------------------'     08.01.13

Reinhard-Prehofer commented 6 years ago

I would prefer a solution that takes the input format into account. In general, all of "my" provided samples are host/mainframe based and thus in fixed column format. This would eliminate solution 1, so I would opt for solution 2. Kind regards
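
For reference, the fixed column format mentioned here reserves columns 1-6 for the sequence number, column 7 for the indicator, columns 8-72 for the program text and columns 73-80 for the identification area. A minimal sketch of that split follows; `FixedLine` and `split_fixed` are hypothetical names used for illustration only and are not part of the parser's API.

```python
from typing import NamedTuple


class FixedLine(NamedTuple):
    sequence: str   # columns 1-6: sequence number
    indicator: str  # column 7: '-' marks a continuation line
    content: str    # columns 8-72: program text
    ident: str      # columns 73-80: identification area


def split_fixed(line: str) -> FixedLine:
    """Split one fixed-format line into its four areas (hypothetical helper)."""
    padded = line.ljust(80)  # short lines are padded with blanks
    return FixedLine(padded[0:6], padded[6], padded[7:72], padded[72:80])


# Build a sample line from parts with known widths instead of hand-counting columns.
sample = "055540" + "-" + "    '-----'".ljust(65) + "08.01.13"
parts = split_fixed(sample)
print(repr(parts.indicator), repr(parts.content.strip()), repr(parts.ident))
```

With this split, a trailing marker such as 08.01.13 ends up in the identification area and never reaches the literal handling.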

uwol commented 6 years ago

Argh, gets more complicated. NIST test case ST135A shows

043500     IF      SORTOUT-NON-KEY-1 NOT EQUAL TO "                     ST1354.2
043600-    "              A" GO TO SORT-FAIL-1.                         ST1354.2

has to be processed to

           IF      SORTOUT-NON-KEY-1 NOT EQUAL TO "                                   A" GO TO SORT-FAIL-1.

whereas

055530             DISPLAY '------------------------------------------' 08.01.13
055540-                    '-------------------------------------'      08.01.13

has to be processed to

                   DISPLAY '------------------------------------------''-------------------------------------'

I'll have to search for a solution that's more robust than number_of_quotes_in_line % 2 == 0 :-)
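
To make the heuristic concrete, here is a minimal sketch of the quote-parity check named above. The function name and the sample strings are illustrative assumptions. The third sample, which mixes both delimiter characters, is a constructed case showing one way such a count can misjudge a line.

```python
def quotes_balanced(content: str) -> bool:
    """Naive heuristic: an even number of quote characters means no literal is left open."""
    number_of_quotes_in_line = content.count("'") + content.count('"')
    return number_of_quotes_in_line % 2 == 0


# Literal opened and closed on the same line.
closed_literal = "DISPLAY '-----'"
# Literal opened but not closed, as in the NIST example.
open_literal = 'IF SORTOUT-NON-KEY-1 NOT EQUAL TO "      '
# Constructed case: the literal is still open, yet two quote characters are counted.
mixed_delimiters = 'DISPLAY "DON\'T '

for sample in (closed_literal, open_literal, mixed_delimiters):
    print(repr(sample), quotes_balanced(sample))
```

The first two samples are classified as intended. The third is reported as balanced although its literal is still open.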

uwol commented 6 years ago

Ok, added a fix in 2cbdd8eada9fa58c2c66b14ec336c1c9fbe11471. This issue was quite complex, so the commit had to establish data structures for looking back to the previous line and ahead to the next line. On top of that, logic had to be implemented that takes several constellations of lines into account. Hopefully all relevant constellations are covered. All tests are green.
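
As a rough illustration of the look-back and look-ahead idea (not the code from the commit), lines can be modelled as records that know their neighbours. All names below (`Line`, `link`, `is_continuation`, `is_continued`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Line:
    indicator: str  # column 7 of the fixed-format line
    content: str    # columns 8-72
    # Neighbour links are excluded from repr and comparison to avoid cycles.
    prev: Optional["Line"] = field(default=None, repr=False, compare=False)
    next: Optional["Line"] = field(default=None, repr=False, compare=False)


def link(raw: List[Tuple[str, str]]) -> List[Line]:
    """Create Line records and connect each one to its neighbours."""
    lines = [Line(indicator, content) for indicator, content in raw]
    for before, after in zip(lines, lines[1:]):
        before.next = after
        after.prev = before
    return lines


def is_continuation(line: Line) -> bool:
    """A '-' in the indicator column marks a continuation line."""
    return line.indicator == "-"


def is_continued(line: Line) -> bool:
    """Look ahead: is this line followed by a continuation line?"""
    return line.next is not None and is_continuation(line.next)


lines = link([(" ", "DISPLAY 'A'"), ("-", "'B'"), (" ", "STOP RUN.")])
print([is_continued(line) for line in lines])
```

With both links in place, a rule for one line can inspect how the previous line ended and how the next line begins before deciding how to rewrite it.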

I'm going to re-test it with the example provided by you in this issue thread. If that file parses successfully, I'm going to close this thread.

However, I see that you reported a similar issue in #37, which is potentially also fixed by 2cbdd8eada9fa58c2c66b14ec336c1c9fbe11471. I'll check that as part of #37.

uwol commented 6 years ago

Added a passing unit test similar to the one provided by you in 047512dee785314430a52ad36aac977d97a91b19.

Thanks again!

Reinhard-Prehofer commented 6 years ago

YES, thanks, the parser works, though ... (sorry). Could you possibly add a BLANK between the literals, i.e. display '=======' '================' instead of the current display '=======''================', where the two quotes are adjacent? That way no two quotes appear right next to each other; the IBM compiler does not like that very much.

uwol commented 6 years ago

Ok, no problem. Added in ea1322a1c7b4a203191d21fa0091aefc4ad188ee. Now it is display '=======' '================'.

There could be other edge cases with a combination of numeric and string literals, such as display 1'================'. I would suggest waiting to see whether such cases provoke errors, and fixing them on demand. The rule set for line indicators is already complex enough :-)
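
The two joining behaviours discussed in this thread can be sketched as follows: a continuation that resumes an open literal is appended directly, while a continuation that starts a new literal is separated by a blank. `join_continuation` is a hypothetical helper. It uses quote parity to detect an open literal, which is a simplification the thread itself considers too weak for the real rule set.

```python
def join_continuation(first: str, continuation: str) -> str:
    """Join the content areas of a line and its continuation line (sketch only)."""
    quote_count = first.count("'") + first.count('"')
    rest = continuation.lstrip()
    if quote_count % 2 == 1:
        # Literal still open: the trailing blanks of the first line belong to it,
        # and the quote that resumes the literal on the continuation is dropped.
        if rest[:1] not in ("'", '"'):
            raise ValueError("continuation of an open literal must start with a quote")
        return first + rest[1:]
    # Literal already closed: keep both literals and put one blank between them.
    return first.rstrip() + " " + rest


closed_case = join_continuation("display '======='", "        '================'")
open_case = join_continuation('IF A NOT EQUAL TO "   ', '    "  A" GO TO X.')
print(closed_case)
print(open_case)
```

The numeric-literal edge case mentioned above is not handled by this sketch.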