uwol / proleap-cobol-parser

ProLeap ANTLR4-based parser for COBOL
MIT License
149 stars, 78 forks

Can't associate comments with program lines #75

Open mariusflorea1977 opened 5 years ago

mariusflorea1977 commented 5 years ago

Hi everybody! I need a way to associate comments with the "statements" that follow them. I'm experienced in doing syntactic analysis programmatically, but I know almost nothing about ANTLR grammars. Your COBOL parser seems to be very complete, so I don't need to write one. I wonder if there could be a way to get COBOL program comments as regular "statements", so that I can associate them with the "statements" and/or declarations following them. As an example:

     IDENTIFICATION DIVISION.
     PROGRAM-ID. HelloWorld.
     DATA DIVISION.
     WORKING-STORAGE SECTION.
     01 ITEMS.
         02 ITEM1 PIC X(10).
    *Comment here.
         02 ITEM2 PIC X(10).

I wish I could associate the comment line with the data item declaration that follows it. If the comment line were processed as a regular item, let's say as a data declaration item in this particular case, I could associate them when browsing the definition of "ITEMS", getting "ITEM1", then the comment, then "ITEM2". I don't know if this is possible and, if so, whether it's easy to do...

P.S.: Excuse my poor English !!!

Reinhard-Prehofer commented 5 years ago

Hi Marius! I understand what (and of course also why) you want: comments associated with statements and declarations. You mention being rather new to ANTLR. Well ... you should study its mechanisms and building blocks, it really pays off: https://github.com/uwol/proleap-cobol-parser/blob/master/src/main/antlr4/io/proleap/cobol/Cobol.g4

If you have a look at the definition of the grammar (in EBNF/ANTLR format) you will see `commentEntry?` quite often, meaning there could be an optional comment. Further down in the grammar you find:

`COMMENTENTRYTAG : '*>CE';`
`COMMENTTAG : '*>';`

This means that the positional column-7 asterisk (`*`) will be transformed to the ANSI comment `*>`, which works like an end-of-line comment `//` in Java or C. Thus internally all comments are transformed to ANSI comments. BUT then ...

`COMMENTLINE : COMMENTTAG WS ~('\n' | '\r')* -> channel(HIDDEN);`

This means that comments in the current Cobol.g4 grammar are not processed by the parser but passed along on a hidden channel. So you would have to change that definition in your own copy of Cobol.g4 for sure. But maybe Ulrich has a better hint for you.

Kind regards

mariusflorea1977 commented 5 years ago

OK, if I understand correctly, every time there is a << commentEntry? >> somewhere in the grammar file, it can be mapped to a comment associated with the "statement" ending with << commentEntry? >>. In order not to send comment lines to the hidden channel, I have to remove << -> channel(HIDDEN) >> from the line concerned. So, once that is done, could you tell me how I can get the comment of a "statement" in Java, if one is present, using this technique? If I could do that, it would save me at least weeks of work :) ... Thank you for your support!

uwol commented 5 years ago

Hi Marius,

thanks for your interest! Reinhard is right.

So to preserve comment lines, one would have to (1) direct comment lines to another channel and thus preserve them in the AST and then (2) optionally make them accessible in the ASG for convenient access.
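To illustrate step (1), here is a minimal toy model in plain Java of what a separate channel buys you. It is not ProLeap or ANTLR code: `Tok`, `ChannelDemo`, and the channel number `COMMENTS_CHANNEL` are hypothetical names chosen for this sketch. The idea is that comment tokens stay in the token stream without being seen by the parser, so the comments in front of a statement can be collected by walking left from the statement's first token until the previous default-channel token. On a real ANTLR token stream, `BufferedTokenStream.getHiddenTokensToLeft(tokenIndex, channel)` performs this kind of lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: Tok and ChannelDemo are hypothetical, not ProLeap or ANTLR classes.
class ChannelDemo {
    static final int DEFAULT_CHANNEL = 0;  // tokens the parser sees
    static final int HIDDEN_CHANNEL = 1;   // whitespace and newlines
    static final int COMMENTS_CHANNEL = 2; // assumed dedicated channel for comment lines

    static final class Tok {
        final int channel;
        final String text;

        Tok(int channel, String text) {
            this.channel = channel;
            this.text = text;
        }
    }

    /**
     * Collects the comment tokens directly to the left of tokens[index],
     * stopping at the previous default-channel token. Result is in source order.
     */
    static List<String> commentsToLeft(List<Tok> tokens, int index) {
        List<String> found = new ArrayList<>();
        for (int i = index - 1; i >= 0 && tokens.get(i).channel != DEFAULT_CHANNEL; i--) {
            if (tokens.get(i).channel == COMMENTS_CHANNEL) {
                found.add(tokens.get(i).text);
            }
        }
        Collections.reverse(found);
        return found;
    }

    public static void main(String[] args) {
        // Each declaration is simplified to a single token here.
        List<Tok> tokens = Arrays.asList(
            new Tok(DEFAULT_CHANNEL, "02 ITEM1 PIC X(10)."),
            new Tok(HIDDEN_CHANNEL, "\n"),
            new Tok(COMMENTS_CHANNEL, "*> Comment here."),
            new Tok(HIDDEN_CHANNEL, "\n"),
            new Tok(DEFAULT_CHANNEL, "02 ITEM2 PIC X(10)."));
        System.out.println(commentsToLeft(tokens, 4));
    }
}
```

Step (2) would then amount to storing the result of such a lookup on the corresponding ASG element, which is the part that needs real implementation work.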

mariusflorea1977 commented 5 years ago

Hi Ulrich! Thank you for your support! I think I can preserve the comments by making the change I mentioned above, but I don't know how to make them available when browsing the ASG/AST. (In fact, I know what an AST is, but I don't know exactly what an ASG is or how the two differ.) Could you please give me a starting point in the code for making comments available? Failing that, I could associate the comments with the "statements" around them if I could retrieve the source lines of those statements and of the comments. Any method is OK for me, but so far I haven't been able to make them available. Could you please help me a little more and just give me a starting point :) ?
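The line-based fallback described here can be sketched without touching the grammar at all. The following is a minimal, parser-independent sketch in plain Java; `CommentAssociator` is a hypothetical helper, not part of ProLeap. It assumes fixed-format source, where `*` or `/` in column 7 marks a comment line, and it also accepts lines starting with `*>`. It maps the 1-based line number of each non-comment line to the comment lines directly above it. Matching those line numbers against the source lines of parsed statements is left open, since that depends on how the parser reports positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ProLeap: pairs comment lines with the next code line.
class CommentAssociator {

    /** Assumes fixed format: '*' or '/' in column 7, or a line that starts with "*>". */
    static boolean isComment(String line) {
        if (line.trim().startsWith("*>")) {
            return true;
        }
        return line.length() >= 7 && (line.charAt(6) == '*' || line.charAt(6) == '/');
    }

    /** Returns the comment text without the comment marker. */
    static String commentText(String line) {
        String trimmed = line.trim();
        if (trimmed.startsWith("*>")) {
            return trimmed.substring(2).trim();
        }
        return line.substring(7).trim();
    }

    /** Maps the 1-based number of each code line to the comments directly above it. */
    static Map<Integer, List<String>> associate(List<String> lines) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        List<String> pending = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue; // blank lines do not break the association
            }
            if (isComment(line)) {
                pending.add(commentText(line));
            } else if (!pending.isEmpty()) {
                result.put(i + 1, new ArrayList<>(pending));
                pending.clear();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "       01 ITEMS.",
            "           02 ITEM1 PIC X(10).",
            "      *Comment here.",
            "           02 ITEM2 PIC X(10).");
        System.out.println(associate(source));
    }
}
```

Comments that trail the last code line are dropped in this sketch, and continuation lines or inline `*>` comments after code are not handled.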

Reinhard-Prehofer commented 5 years ago

Hi all, changing how comments are treated is not rocket science, but it takes more than just a few hints on how and what to do ... I am not the ANTLR guru, but it would take me a couple of days to implement such a feature decently. I really recommend that you dive a bit into ANTLR and its mechanisms. There are a couple of good ebooks around (costing approx. 40 euros), and they are worth it. Have a look at Federico Tomassetti's book and samples on ANTLR 4; there is a chapter about handling comments and channels in it: https://tomassetti.me/antlr-mega-tutorial/

uwol commented 5 years ago

Yes, that would take me a few days, too. Because of that effort it is not implemented yet 😃 If you are interested, we could add that feature, but you would have to share the costs -> very small project.

mariusflorea1977 commented 5 years ago

Thank you for your support! For the moment, I think that we (we are a community) don't have the money for it, so we will try to do it ourselves. Maybe later, if we fail and have some more money, we will contribute to the costs. But before we investigate further, how much money would you need from us to implement this functionality :) ?