Open BjAlvestad opened 1 year ago
Note that the same applies to REGION.
Sort of solved by using the following terminals:
terminal REGION_TEXT: /REGION [^\r\n]*/;
terminal TITLE_TEXT: /TITLE = [^\r\n]*/;
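To illustrate what these two terminals capture, here is a minimal TypeScript sketch that applies the same patterns outside of Langium. The helpers `matchAt` and `payload` and the sample inputs are hypothetical and only for illustration; the `i` flag stands in for the switched-off case sensitivity, and how the project actually extracts the name from the token is not shown in this issue.

```typescript
// Same patterns as the REGION_TEXT and TITLE_TEXT terminals above.
const REGION_TEXT = /REGION [^\r\n]*/;
const TITLE_TEXT = /TITLE = [^\r\n]*/;

// Hypothetical helper: match `pattern` exactly at `offset`, ignoring case.
function matchAt(pattern: RegExp, input: string, offset: number): string | undefined {
    const sticky = new RegExp(pattern.source, 'iy');
    sticky.lastIndex = offset;
    const match = sticky.exec(input);
    return match ? match[0] : undefined;
}

// Hypothetical helper: drop the keyword part that the terminal had to swallow.
function payload(tokenText: string, prefixLength: number): string {
    return tokenText.slice(prefixLength).trim();
}

const regionSource = 'REGION Interval check\n    x := 1;\nEND_REGION';
const titleSource = 'TITLE = Pump control\r\nVERSION : 0.1';

const regionToken = matchAt(REGION_TEXT, regionSource, 0) || '';
const titleToken = matchAt(TITLE_TEXT, titleSource, 0) || '';

console.log(regionToken);                            // REGION Interval check
console.log(payload(regionToken, 'REGION '.length)); // Interval check
console.log(payload(titleToken, 'TITLE = '.length)); // Pump control
```

Because the keyword is inside the terminal, the text is matched as one token up to the end of the line, so a leading word such as Interval never gets a chance to be lexed as INT. The cost is that the keyword prefix has to be stripped afterwards.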
Also tried solving this with positive lookbehind so that REGION could be a keyword, but ran into some weird issues with keyword clashes when doing so (ref. commit 625435fd114de991d4b752149511dacb62af0652):
This is a workaround for the issue where parsing/lexing of a region would fail if the region text started with a word that contained a keyword. E.g. Interval ... would cause it to fail, since INT is a keyword (and we have switched off case sensitivity). Note that Some Interval ... would not cause it to fail, since it was only an issue if it was the start of the terminal.
This commit makes REGION and TITLE = part of the terminal, instead of using positive lookbehind.
The issue with keyword clashes is due to keywords being evaluated before terminals.
Could possibly override DefaultTokenBuilder to evaluate some keywords after some terminals, but before the ID terminal (since any keyword would also be a valid ID).
This would make it possible to parse region and title without including the keyword in the terminal. It would also help with parsing comments in XML that start with a keyword, without having to do similar tricks there.
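The effect of token ordering can be sketched with a toy first-match lexer in TypeScript. This is not Langium's DefaultTokenBuilder API and not the project's code: the `tokenize` function, the rule names, and the lookbehind-based `REGION_NAME` pattern are assumptions made up for this illustration. It only shows why a keyword that is tried before the region terminal wins at the start of the region text, and how moving that keyword after the terminal (but before ID) avoids the clash.

```typescript
interface TokenRule { name: string; pattern: string; }
interface Token { name: string; text: string; }

// Toy lexer: at each position the first rule that matches wins.
function tokenize(input: string, rules: TokenRule[]): Token[] {
    const tokens: Token[] = [];
    let pos = 0;
    while (pos < input.length) {
        let matched = false;
        for (const rule of rules) {
            // 'y' anchors the match at pos, 'i' mimics case-insensitive keywords.
            const re = new RegExp(rule.pattern, 'iy');
            re.lastIndex = pos;
            const m = re.exec(input);
            if (m && m[0].length > 0) {
                if (rule.name !== 'WS') tokens.push({ name: rule.name, text: m[0] });
                pos += m[0].length;
                matched = true;
                break;
            }
        }
        if (!matched) throw new Error(`Unexpected character at offset ${pos}`);
    }
    return tokens;
}

const KW_REGION: TokenRule = { name: 'KW_REGION', pattern: 'REGION' };
const KW_INT: TokenRule = { name: 'KW_INT', pattern: 'INT' };
// Region text via positive lookbehind, so REGION itself can stay a keyword.
const REGION_NAME: TokenRule = { name: 'REGION_NAME', pattern: '(?<=REGION[ \\t]+)[^\\r\\n]+' };
const ID: TokenRule = { name: 'ID', pattern: '[A-Za-z_][A-Za-z0-9_]*' };
const WS: TokenRule = { name: 'WS', pattern: '\\s+' };

// All keywords before all terminals: INT wins at the start of "Interval".
const keywordsFirst = [KW_REGION, KW_INT, REGION_NAME, ID, WS];
// INT moved after REGION_NAME but still before ID.
const terminalsFirst = [KW_REGION, REGION_NAME, KW_INT, ID, WS];

const names = (tokens: Token[]) => tokens.map(t => t.name).join(',');
console.log(names(tokenize('REGION Interval check', keywordsFirst)));  // KW_REGION,KW_INT,ID,ID
console.log(names(tokenize('REGION Interval check', terminalsFirst))); // KW_REGION,REGION_NAME
```

With the keywords-first order, the input `REGION Some Interval` still lexes as `KW_REGION,REGION_NAME`, which matches the observation that the clash only occurs when the keyword is at the start of the terminal text.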
DefaultTokenBuilder has since been overridden to add support for nested multi-line comments, so it may now be easier to use this for Title and Region as well. However, since the current parsing works, this is a low priority task.
In TIA Portal SCL the title is defined as a string without any quotes.
Presumably it can then NOT be spread over multiple lines, and is therefore parsed up to the first newline. However, our grammar eats up all whitespace (including newlines) to simplify situations where whitespace does not matter.
Need to find a way to parse TIA Portal style block titles.
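The problem can be illustrated with a small TypeScript sketch. The sample block header below is made up for illustration, and the code does not use Langium; it only contrasts a whitespace-insensitive view of the source, where the end of the title is lost, with a line-aware match that stops at the first newline.

```typescript
// Hypothetical sample of an unquoted TIA Portal style title line.
const source = 'FUNCTION_BLOCK "Pump"\nTITLE = Pump control\nVERSION : 0.1\n';

// Whitespace-insensitive view: newlines are discarded like any other whitespace,
// so nothing marks where the title ends and the next statement begins.
const words = source.split(/\s+/).filter(w => w.length > 0);
console.log(words.join(' | '));

// Line-aware view: everything after "TITLE =" up to the first line break.
const titleMatch = /^TITLE[ \t]*=[ \t]*([^\r\n]*)/im.exec(source);
const title = titleMatch ? titleMatch[1].trim() : undefined;
console.log(title); // Pump control
```

In the first view the word `control` is followed directly by `VERSION`, with no marker between them, which is why the title has to be captured by a terminal that sees the newline before whitespace is skipped.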