ridencww / goldengine

Java implementation of Devin Cook's GOLD Parser engine

Is there a built-in infrastructure to associate source code comments with tokens? #18

Open codemanyak opened 6 years ago

codemanyak commented 6 years ago

For the reverse-engineering facility in Structorizer (which generates Nassi-Shneiderman diagrams by parsing source code), we wanted to associate the source comments found during parsing with their closest tokens. We didn't find a way to do this, so we wrote our own workaround. Have we missed something? If not, would this be a helpful enhancement?

I attach the GOLDParser subclass we wrote for this purpose (ignore the proprietary logging mechanism, which has nothing to do with it). It simply produces a hash map Token --> String (the protected field commentMap).

In the diagram generator, we then defined a further map Reduction --> Token, derived from all non-terminal entries of the token-comment map. We also defined language-specific sets of production rule IDs that act as stoppers when the retrieved comment strings are associated with meaningful syntactic units (diagram elements); without them, all comments of substructure elements would also be attached to their containing compound statements. This is of course an application-specific detail, briefly outlined in the class comment.

AuParser.zip
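The idea can be sketched roughly as follows. This is not the attached AuParser code and does not use the engine's API: `Node` is a hypothetical stand-in for the engine's Token and Reduction types, `onComment`/`onToken` are made-up hook names, and attaching a comment to the next token that follows it is only an assumption about what "closest" means.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch only: Node stands in for the engine's Token/Reduction types. */
class CommentMapSketch {

    /** Hypothetical stand-in for a token; non-terminals carry a rule id and children. */
    static final class Node {
        final String text;
        final int ruleId;                       // -1 for terminals
        final List<Node> children = new ArrayList<>();

        Node(String text, int ruleId, Node... kids) {
            this.text = text;
            this.ruleId = ruleId;
            for (Node k : kids) children.add(k);
        }
    }

    /** Token --> comment text (keys are compared by identity). */
    final Map<Node, String> commentMap = new HashMap<>();
    private final StringBuilder pending = new StringBuilder();

    /** Hypothetical hook: call for every comment the lexer reports. */
    void onComment(String comment) {
        if (pending.length() > 0) pending.append('\n');
        pending.append(comment.trim());
    }

    /** Hypothetical hook: call for every ordinary token; pending comments are attached to it. */
    void onToken(Node token) {
        if (pending.length() > 0) {
            commentMap.put(token, pending.toString());
            pending.setLength(0);
        }
    }

    /** Collects the comments below a node without descending into nested stopper rules. */
    List<String> commentsFor(Node node, Set<Integer> stopperRuleIds) {
        List<String> out = new ArrayList<>();
        collect(node, stopperRuleIds, out, true);
        return out;
    }

    private void collect(Node n, Set<Integer> stoppers, List<String> out, boolean isRoot) {
        if (!isRoot && stoppers.contains(n.ruleId)) return;  // nested statement keeps its own comments
        String c = commentMap.get(n);
        if (c != null) out.add(c);
        for (Node child : n.children) collect(child, stoppers, out, false);
    }

    public static void main(String[] args) {
        CommentMapSketch s = new CommentMapSketch();
        Node whileTok = new Node("while", -1);
        Node xTok = new Node("x", -1);
        s.onComment("// outer");
        s.onToken(whileTok);
        s.onComment("// inner");
        s.onToken(xTok);
        Node assign = new Node("<Assign>", 2, xTok);
        Node loop = new Node("<While>", 1, whileTok, assign);
        System.out.println(s.commentsFor(loop, Set.of(1, 2)));  // [// outer]
        System.out.println(s.commentsFor(loop, Set.of()));      // [// outer, // inner]
    }
}
```

With the stopper set `{1, 2}`, the loop only receives the comment in front of `while`, and the nested assignment keeps its own. With an empty stopper set, the loop would collect both. Comments that are still pending at the end of the input are not attached to anything in this sketch.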

If you decide to integrate the proposed comment retrieval infrastructure, it might be helpful to make the commentMap field public.

codemanyak commented 4 months ago

An improved version of the workaround is available here: AuParser_2024-04-16.zip