uwol / proleap-cobol-parser

ProLeap ANTLR4-based parser for COBOL
MIT License

EXEC SQL sections are removed in the parsed program #6

Closed albertolovato closed 7 years ago

albertolovato commented 7 years ago

I tried printing the text of a compilation unit retrieved using getCtx().getText(), and the EXEC SQL sections are completely ignored. Is this a missing feature of the parser? Or is there another way to retrieve the embedded SQL statements?

Thank you

uwol commented 7 years ago

Correct, the current implementation of the COBOL preprocessor deletes EXEC SQL statements from the output stream (cf. CobolPreprocessorImpl.java).

However, I could commit an implementation that preserves EXEC SQL statements by converting them to comment lines. For example, EXEC SQL chardata END-EXEC would be converted to something like >*EXEC SQL chardata END-EXEC in the COBOL preprocessor output stream, so that the COBOL parser does not try to parse SQL statements as COBOL statements.
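To make the idea concrete, here is a minimal sketch of such a conversion. It is not the code in CobolPreprocessorImpl.java: the class ExecSqlCommenter, its method name and the regular expression are hypothetical, and the >* tag is simply taken from the example above. It also assumes that each EXEC SQL statement starts on its own line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of proleap-cobol-parser. */
final class ExecSqlCommenter {

    // Assumed comment tag, copied from the example in the comment above.
    static final String COMMENT_TAG = ">*";

    // EXEC SQL ... END-EXEC, possibly spanning several lines, optional trailing period.
    // The word boundary after SQL keeps EXEC SQLIMS from matching.
    private static final Pattern EXEC_SQL = Pattern.compile(
            "EXEC\\s+SQL\\b.*?\\bEND-EXEC\\.?",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Replaces every EXEC SQL statement with a single tagged comment line. */
    static String toCommentLines(String source) {
        Matcher m = EXEC_SQL.matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(source, last, m.start());
            // Collapse line breaks and runs of blanks so the statement fits on one line.
            String statement = m.group().trim().replaceAll("\\s+", " ");
            out.append(COMMENT_TAG).append(statement);
            last = m.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // prints ">*EXEC SQL chardata END-EXEC"
        System.out.println(toCommentLines("EXEC SQL chardata END-EXEC"));
    }
}
```

A real preprocessor would also have to respect COBOL column rules and continuation lines, which this sketch ignores.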

Can you please provide an example COBOL snippet that includes the EXEC SQL statement you are trying to parse? That would help me write a unit test.

albertolovato commented 7 years ago

Thank you, but aren't comments also removed in the parsed code?

Anyway, I tried an example from tutorialspoint:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.

           EXEC SQL INCLUDE STUDENT END-EXEC.

           EXEC SQL BEGIN DECLARE SECTION END-EXEC.
       01  WS-STUDENT-REC.
           05 WS-STUDENT-ID      PIC 9(4).
           05 WS-STUDENT-NAME    PIC X(25).
           05 WS-STUDENT-ADDRESS PIC X(50).
           EXEC SQL END DECLARE SECTION END-EXEC.

       PROCEDURE DIVISION.
           EXEC SQL
               SELECT STUDENT-ID, STUDENT-NAME, STUDENT-ADDRESS
               INTO :WS-STUDENT-ID, :WS-STUDENT-NAME, :WS-STUDENT-ADDRESS
               FROM STUDENT
               WHERE STUDENT-ID=1004
           END-EXEC.

           IF SQLCODE=0
               DISPLAY WS-STUDENT-REC
           ELSE
               DISPLAY 'Error'
           END-IF.
           STOP RUN.

uwol commented 7 years ago

The preprocessor now adds EXEC SQL statements as comments to the output stream (cf. CobolParserPreprocessorListenerImpl.java and ExecSqlMultiline.cbl.preprocessed).

Currently, those and all other comments are skipped by the COBOL parser (cf. Cobol85.g4). I will modify the grammar and parser this week, so that comments are processed on a hidden channel and can be accessed via the abstract semantic graph (ASG).

uwol commented 7 years ago

Ok, now it should work.

The preprocessor now extracts EXEC SQL data description entries from data divisions as well as EXEC SQL statements, EXEC SQLIMS statements and EXEC CICS statements from procedure divisions. In the preprocessor output stream they are marked by the line tags >*EXECSQL, >*EXECSQLIMS and >*EXECCICS (cf. for example ExecSql.cbl.preprocessed).
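As an illustration of what these line tags allow, the sketch below groups tagged lines of a preprocessed source by tag. It is a standalone example, not the parser's own code: the class ExecTagReader is hypothetical, and it assumes that a tagged line consists of optional leading blanks, the tag, and then the statement text. A statement that spans several tagged lines yields one entry per line.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of proleap-cobol-parser. */
final class ExecTagReader {

    // Tags as named in the comment above. >*EXECSQLIMS comes before >*EXECSQL
    // because the shorter tag is a prefix of the longer one.
    static final String[] TAGS = { ">*EXECSQLIMS", ">*EXECSQL", ">*EXECCICS" };

    /** Returns the text behind each tag, keyed by tag, in order of appearance. */
    static Map<String, List<String>> collect(String preprocessed) {
        Map<String, List<String>> byTag = new LinkedHashMap<>();
        for (String tag : TAGS) {
            byTag.put(tag, new ArrayList<>());
        }
        for (String line : preprocessed.split("\\R")) {
            String trimmed = line.trim();
            for (String tag : TAGS) {
                if (trimmed.startsWith(tag)) {
                    byTag.get(tag).add(trimmed.substring(tag.length()).trim());
                    break; // a line carries at most one tag
                }
            }
        }
        return byTag;
    }

    public static void main(String[] args) {
        String sample = "       >*EXECSQL EXEC SQL INCLUDE SQLCA END-EXEC.\n"
                + "       DISPLAY 'Hello'.";
        System.out.println(collect(sample).get(">*EXECSQL"));
    }
}
```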

In the ASG, the contents of those EXEC statements are provided as text (cf. ExecSqlStatementMultilineTest.java). The linked unit test shows how to access that text, which answers your original question.

Additional parsers would be required for parsing these textual EXEC SQL and EXEC CICS statements, e.g. for extracting table names. However, IMO that is a different issue, and I am not sure whether that feature is currently required.
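For simple cases, the statement text can be post-processed without a full SQL parser. The following sketch pulls out the identifiers that follow FROM or JOIN. The class NaiveTableNames and its regular expression are hypothetical and deliberately naive: subqueries, quoted identifiers, comma-separated table lists and other SQL dialect features are not handled, so a dedicated SQL parser remains the right tool for anything beyond this.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of proleap-cobol-parser. */
final class NaiveTableNames {

    // An identifier directly after FROM or JOIN. Host variables such as
    // :WS-STUDENT-ID never match because they start with a colon.
    private static final Pattern TABLE = Pattern.compile(
            "\\b(?:FROM|JOIN)\\s+([A-Za-z_][A-Za-z0-9_.]*)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the distinct table names found, in order of first appearance. */
    static Set<String> tablesIn(String execSqlText) {
        Set<String> tables = new LinkedHashSet<>();
        Matcher m = TABLE.matcher(execSqlText);
        while (m.find()) {
            tables.add(m.group(1));
        }
        return tables;
    }

    public static void main(String[] args) {
        String text = "EXEC SQL SELECT STUDENT-ID INTO :WS-STUDENT-ID "
                + "FROM STUDENT WHERE STUDENT-ID=1004 END-EXEC.";
        // prints "[STUDENT]"
        System.out.println(tablesIn(text));
    }
}
```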

albertolovato commented 7 years ago

Thank you very much, it works perfectly. I don't think your parser should parse SQL code; SQL parsers for Java already exist.