DEMCON / plcdoc

Project to get PLC (Structured Text) documentation into Sphinx
https://plc-doc.readthedocs.io/latest

Special characters inside literal strings are not escaped by grammar #29

Open RobertoRoos opened 1 day ago

RobertoRoos commented 1 day ago

Following #28.

E.g.:

S_GET_CURR_MIN_MAX : STRING := 'CURR:LIM:NEG?;:CURR:LIM:POS?$L';

Philipp1297 commented 1 day ago

I have a potential fix for this problem:

Variable: CommentAny name=ID (',' ID) (address=Address)? ':' type=VariableType ((arglist=ArgList) | (AssignmentSymbol value=AssignmentValue))? ';' comment=CommentLine? ;

AssignmentSymbol: (':=') | ('REF=') ;

VariableType: (array=VariableTypeArray)? (pointer=PointerLike 'TO')? name=BaseType ;

AssignmentValue: ExpressionSemicolon | Expression | ArgList ;

ExpressionSemicolon: /'[^']*'/ ;

Explanation and Fixes

Issue with arglist and AssignmentSymbol: The problem seems to be in the Variable rule, where arglist and AssignmentSymbol could both be matched. Only one of them should be valid for a given declaration, so the rule above groups them as alternatives inside a single optional group.

Order of AssignmentValue: The second fix was to adjust the order of the AssignmentValue alternatives. Alternatives are tried in order, so putting ExpressionSemicolon first lets the parser consume a complete single-quoted string literal, semicolons included, before the more general rules get a chance. If the value is not such a literal, the parser falls through to the other options (Expression or ArgList).
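
To illustrate why the order matters, here is a minimal Python sketch that emulates ordered choice with plain regular expressions. It does not use the real grammar: NAIVE_EXPRESSION is a hypothetical stand-in that simply stops at the first semicolon, and match_first is an illustrative helper, not part of plcdoc or textX.

```python
import re

# Hypothetical stand-ins for the grammar rules; the real Expression rule
# in plcdoc is more involved than this.
STRING_LITERAL = re.compile(r"'[^']*'")  # mirrors ExpressionSemicolon
NAIVE_EXPRESSION = re.compile(r"[^;]+")  # assumption: stops at the first ';'


def match_first(alternatives, text):
    """Return (name, matched_text) for the first alternative matching at position 0."""
    for name, pattern in alternatives:
        m = pattern.match(text)
        if m:
            return name, m.group(0)
    return None


# Right-hand side of the declaration from the issue, including the final ';'.
value = "'CURR:LIM:NEG?;:CURR:LIM:POS?$L';"

string_first = [
    ("ExpressionSemicolon", STRING_LITERAL),
    ("Expression", NAIVE_EXPRESSION),
]
expression_first = [
    ("Expression", NAIVE_EXPRESSION),
    ("ExpressionSemicolon", STRING_LITERAL),
]

print(match_first(string_first, value))
print(match_first(expression_first, value))
```

With the string rule first, the whole literal is returned. With the naive expression first, the match ends at the semicolon inside the string, which is the kind of premature cut-off the reordering is meant to avoid.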

Regex for Semicolons in Strings: Finally, the regex for ExpressionSemicolon, /'[^']*'/, matches everything between an opening single quote and the next single quote. Semicolons inside a string are therefore part of the match and no longer end the expression prematurely.
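
As a quick check of that regex outside the grammar, the sketch below runs it with Python's re module on the string from the issue. It also shows one case the regex does not cover: in IEC 61131-3 string literals, $ is the escape character, so $' is a quote inside the string rather than its end. ESCAPE_AWARE is a hypothetical variant offered only as a possible direction, not something taken from the plcdoc grammar.

```python
import re

# Regex from the proposed ExpressionSemicolon rule.
SIMPLE = re.compile(r"'[^']*'")

# Hypothetical variant: treat '$' plus the following character as one unit,
# so an escaped quote ($') does not terminate the literal.
ESCAPE_AWARE = re.compile(r"'(?:\$.|[^'$])*'")

issue_example = "'CURR:LIM:NEG?;:CURR:LIM:POS?$L';"
escaped_quote = "'it$'s;ok';"  # made-up literal containing an escaped quote

# Both patterns handle the string from the issue, semicolon and $L included.
print(SIMPLE.match(issue_example).group(0))
print(ESCAPE_AWARE.match(issue_example).group(0))

# Only the escape-aware variant reads past the escaped quote.
print(SIMPLE.match(escaped_quote).group(0))
print(ESCAPE_AWARE.match(escaped_quote).group(0))
```

The first two prints give the full literal 'CURR:LIM:NEG?;:CURR:LIM:POS?$L'. For the made-up literal, the simple regex stops at 'it$' while the variant returns 'it$'s;ok'. Whether the grammar needs this depends on how often escaped quotes appear in real declarations.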

Summary

These adjustments aim to resolve the ambiguity between arglist and AssignmentSymbol in the Variable rule, improve the handling of AssignmentValue, and properly account for semicolons within strings. I hope this helps!