Update of parser library to standard Unicon productions

brucerennie commented 3 years ago

This is the update of the parser library to use the standard Unicon language productions. The grammar has been modified to incorporate a new semantic action for each production. The semantic actions are now placed in a separate file, unigram.action.icn. The prefix and suffix code that existed in the original grammar file has been moved to its own code files, "unigram.prefix.icn" and "unigram.suffix.icn". This means that the same grammar file, "unigram.y", can be used in both the parser and the Unicon compiler, with the semantic actions required by each application localised to that application's action, prefix and suffix files.

This shows what can be achieved. The respective changes for the Unicon compiler will be loaded under another pull request when the testing on that has been completed.

In addition, the parser library class files have been updated to use the new parse tree that is created by the new format grammar, action, prefix and suffix files.

Signed-off-by: Bruce Rennie <brennie@dcsi.net.au>

brucerennie commented 3 years ago

I should note here that the parse tree handling routines and the parse tree display procedures have been extensively rewritten for the parser. Because of the changes I have made to the grammar processing, I have completely removed the original tree handling procedures that Robert Parlett used in the original parser class he wrote. They did not sit well with the changes I made. So keep that in mind when looking at the code changes for each of the classes in the parser package.

The tree display procedures have been updated to give a variety of different formats. The one that I have chosen as the default is suited to my comprehension of what the parse tree is displaying. There are 7 variations to choose from, according to what might be more visually pleasing for the individual. The specific formats are described in the parser.icn file and can be selected by changing the second parameter to dump_tree. If this second parameter is not supplied, the display defaults to my preferred format. If this is not considered suitable, it can be changed to any of the other formats, which include the original display format.
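As a rough illustration of selecting a display format through an optional second parameter, here is a minimal sketch in Python rather than Unicon. The format names (`indented`, `sexpr`), the tuple-based tree and the function bodies are assumptions made for the example; the seven actual formats and the real `dump_tree` signature are the ones described in parser.icn.

```python
# Hypothetical sketch only: the format names and tree shape are invented here
# and do not mirror the formats documented in parser.icn.

def _indented(node, depth=0):
    """Return the tree as a list of lines, one node per line, indented by depth."""
    if isinstance(node, str):          # leaf: a token
        return ["  " * depth + node]
    label, *children = node            # interior node: (label, child, child, ...)
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(_indented(child, depth + 1))
    return lines


def _sexpr(node):
    """Return the tree as a single parenthesised expression."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "(" + " ".join([label] + [_sexpr(c) for c in children]) + ")"


# Table of available display formats, keyed by the value of the second parameter.
FORMATS = {
    "indented": lambda tree: "\n".join(_indented(tree)),
    "sexpr": _sexpr,
}


def dump_tree(tree, fmt="indented"):
    """Render `tree` in the requested format; the default applies when fmt is omitted."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown display format: {fmt!r}")
    return FORMATS[fmt](tree)


tree = ("expr", "a", ("term", "b"))
print(dump_tree(tree))            # default format
print(dump_tree(tree, "sexpr"))   # explicitly selected format
```

Keeping the formats in a lookup table means changing the default, or adding another variation, touches one line and leaves the callers alone.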

I do intend to submit another pull request with the same fundamental changes to the grammar file in the next week or so. That pull request will not depend on this one being accepted. I am still working on the symbol table code, which is mostly written but still needs more thorough testing in both the Unicon compiler and the parser package.

brucerennie commented 3 years ago

I understand your concerns here. The reason for this approach is that it allows a single generic grammar file to be defined that is common across any application that needs to analyse Unicon programs. For many of the productions there is no need to change the semantic actions thus defined, especially those actions that are simply $$ := $1. However, due to the requirements of the iconc compiler and the analysis it performs, a number of semantic actions required very specific additions.

My first attempt at designing the semantic actions showed up some problems with the approach I was using, which was to create a common semantic action for multiple productions. After considering the needs of the parser package and the Unicon compiler (icont/iconc), I decided to define an action for each and every production and allow the programmer to change the relevant semantic action however he/she wants, while keeping the grammar file unchanged. This puts the process one level of indirection away from the grammar file.

The other thing I found was that all parameters for each production are passed to the semantic action and this allows the specific parse tree produced by the semantic action to be determined by the programmer outside of changing the grammar file. I am able to restructure the parse tree as I see fit via the semantic actions without affecting the grammar file.
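The idea of one action per production, with every right-hand-side value passed to that action, can be sketched as follows. This is an illustration in Python, not the Unicon code from this pull request: the production names, the two action tables and the `reduce` helper are all hypothetical, and a real yacc-style parser would drive the reductions itself.

```python
# Hypothetical sketch: the grammar table stays fixed while each application
# supplies its own table of per-production actions.

# Shared "grammar": production name -> right-hand-side symbols (never edited per application).
GRAMMAR = {
    "expr_plus": ("expr", "PLUS", "term"),
    "expr_term": ("term",),
}


def passthrough(*values):
    """Counterpart of the `$$ := $1` action: hand back the first value unchanged."""
    return values[0]


# Application A: keep every child, labelled with the production name.
FULL_TREE_ACTIONS = {
    "expr_plus": lambda left, op, right: ("expr_plus", left, op, right),
    "expr_term": passthrough,
}

# Application B: restructure the node and drop the operator token.
COMPACT_ACTIONS = {
    "expr_plus": lambda left, op, right: ("+", left, right),
    "expr_term": passthrough,
}


def reduce(production, values, actions):
    """Call the action for `production`, passing all right-hand-side values to it."""
    expected = len(GRAMMAR[production])
    if len(values) != expected:
        raise ValueError(f"{production} expects {expected} values, got {len(values)}")
    return actions[production](*values)


def build(actions):
    """Simulate the reductions a parser would perform for the input `a + b`."""
    left = reduce("expr_term", ["a"], actions)
    return reduce("expr_plus", [left, "+", "b"], actions)


print(build(FULL_TREE_ACTIONS))  # tree that keeps every child
print(build(COMPACT_ACTIONS))    # restructured tree from the same grammar
```

Both calls walk the same grammar table; only the action table differs, which is the property that lets the parse tree shape change without editing the grammar.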

Now, with the above said, this may not fit in with how other people do things, and if the specific structure I have created here doesn't suit, I have no problem with it being changed. I have long experience of supporting other people's code that required major changes to keep it operationally effective, and I have found that making code free of unexpected consequences is a long-term maintenance feature. It ends up requiring less work to understand what will happen when changes are made.

I had a similar view when I made the changes to the parse tree scanning procedures that Robert Parlett had written. They worked with the specific parse tree structure he had implemented, but became difficult to use when the parse tree changed. What I am aiming for is simplicity in the general structure of the code (or at least simplicity from my perspective). However, I do understand that what I consider simple and easily understandable will not necessarily be considered so by others. There is a place for "trickiness" and complexity, but in "normal" code I do try to avoid it or at least make it as uniform as I can.

uniconproject / unicon

Update of parser library to standard Unicon productions #177