eclipse-che4z / che-che4z-lsp-for-cobol

COBOL Language Support provides autocomplete, highlighting and diagnostics for COBOL code and copybooks

Better parser error recovery #1263

Open zimlu02 opened 2 years ago

zimlu02 commented 2 years ago

Is your feature request related to a problem?

Currently, if the parser runs into an unexpected token, it stops and cannot parse the rest of the code.

Describe the solution you'd like

Come up with a strategy to gracefully ignore unrecognized syntax and continue parsing.

For an example, see 1) in the issue: https://github.com/BroadcomMFD/cobol-control-flow/issues/3#issuecomment-1034141980
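
One common way to "ignore and continue" is to resynchronize on a reliable delimiter. The following is only a minimal sketch of that idea, not how this project's parser works: it assumes a toy tokenizer, treats the period as the synchronization point, and uses hypothetical names (`tokenize`, `parse_entries`, `ParseResult`). The accepted level numbers are copied from the "expected" list that the diagnostic further down this thread reports.

```python
from dataclasses import dataclass, field

# Level numbers the diagnostic in this thread lists as expected: '01-49', '66', '77', '88'.
KNOWN_LEVELS = {f"{n:02d}" for n in range(1, 50)} | {"66", "77", "88"}


@dataclass
class ParseResult:
    entries: list = field(default_factory=list)      # entries that were accepted
    diagnostics: list = field(default_factory=list)  # one message per skipped entry


def tokenize(source):
    # Toy tokenizer (assumption): split on whitespace and make a trailing
    # period its own token. Periods inside literals are not handled.
    tokens = []
    for word in source.split():
        if word.endswith(".") and len(word) > 1:
            tokens.extend([word[:-1], "."])
        else:
            tokens.append(word)
    return tokens


def parse_entries(tokens):
    result = ParseResult()
    pos = 0
    while pos < len(tokens):
        # Locate the end of the current entry: the next period or end of input.
        end = pos
        while end < len(tokens) and tokens[end] != ".":
            end += 1
        entry = tokens[pos:end]
        if entry and entry[0] in KNOWN_LEVELS:
            result.entries.append(entry)
        elif entry:
            # Report the problem, drop this entry only, and keep going.
            result.diagnostics.append(f"Syntax error on '{entry[0]}', entry skipped")
        pos = end + 1  # resume right after the synchronizing period
    return result


result = parse_entries(tokenize("01 A. 78 B VALUE 'X'. 77 C PIC X."))
```

With this input, the `78` entry produces one diagnostic while the `01` and `77` entries around it are still parsed, which is the behaviour the request asks for.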

GitMensch commented 1 year ago

Another example is the use of unknown level numbers:

       IDENTIFICATION DIVISION.
       PROGRAM-ID.             TESTNC02.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-PC  WITH DEBUGGING MODE.
       OBJECT-COMPUTER. IBM-PC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.

       78 COMPARE      VALUE     'COMPARE_DISPLAY_TO_DISPLAY2'.
       77 ANSWER       PIC X.
       77 NUM-FLD      PIC S9(35)v9.

       01 FLD-X7.
        89 ALL-SPACE VALUE SPACES.
      *  03 FLD-97     PIC S9(5)v9(32).
         03 FLD-97     PIC S9(02)V9(35).
      *  03 FLD-97     PIC  9(3)           SIGN LEADING SEPARATE.

       01  TMP.
           05  TMP-DOUBLE                  BINARY-DOUBLE UNSIGNED.
           05  TMP-MILLISECONDS            BINARY-LONG.
           05  TMP-DISPLAY-LONG            PIC ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZ9.

results in

[{
    "resource": "/C:/Temp/TESTNUM1.CBL",
    "owner": "_generated_diagnostic_collection_name_#1",
    "severity": 8,
    "message": "Syntax error on '78' expected {CBL, END, EXEC, FILE, ID, IDENTIFICATION, LINKAGE, LOCAL-STORAGE, PROCEDURE, PROCESS, WORKING-STORAGE, '01-49', '66', '77', '88'}",
    "source": "COBOL Language Support (parsing)",
    "startLineNumber": 10,
    "startColumn": 8,
    "endLineNumber": 10,
    "endColumn": 10
}]

and no further parsing is done (the layout shows an empty WORKING-STORAGE).

This very specific example could be worked around by splitting the level numbers from their values: if any value from [0]1 to 99 (or any integer in general) were accepted and checked there, it would be possible to raise an "invalid level number" error message.

If the level 78 line is commented out, the same happens at the 88->89 "typo": the parser just stops any further parsing.
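
The level-number idea above can be sketched as a two-step check: first accept any integer in the level position, then validate its value separately so that a bad level becomes a targeted diagnostic rather than a generic syntax error. This is a hypothetical illustration (`check_level` and `VALID_LEVELS` are made-up names, not part of the extension), and the accepted set is simply the one the reported diagnostic lists; a dialect that allows other levels would need a different set.

```python
import re

# Accepted values, taken from the diagnostic's "expected" list: 01-49, 66, 77, 88.
VALID_LEVELS = set(range(1, 50)) | {66, 77, 88}


def check_level(token):
    """Return None if the level number is accepted, otherwise a diagnostic message."""
    # Step 1: the "lexer" side only asks whether this is an integer at all.
    if not re.fullmatch(r"[0-9]+", token):
        return f"expected a level number, found '{token}'"
    # Step 2: the value is validated separately, so the message can be specific.
    if int(token) not in VALID_LEVELS:
        return f"invalid level number '{token}'"
    return None


for level in ["01", "78", "89", "77"]:
    print(level, "->", check_level(level) or "ok")
```

Because the token is always consumed as a level number, the parser could keep its position in the data entry and continue with the name and clauses that follow.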