Closed by codemanyak 8 months ago
It turned out that the Structorizer hack does not cause the problem: it occurs even if the hack is disabled. This means the fault is in the engine itself (which Ralph Iden derived from an open-source version 5.0.0, whereas the GOLDBuilder represents version 5.2.0, for which no source code is publicly available). But the condition for the parsing failure is somewhat more complicated: the string literal must also contain a tab character before the comment symbol to raise the error! And then it affects C sources as well, of course. (Pascal import is not affected; there, tab characters are removed from string literals completely, which isn't desirable either, btw.)
It wasn't the comment symbol at all: the mere occurrence of a tab character in the string literal causes the failure. And the code passed the GOLDBuilder only because it automatically replaces tab characters with blanks on loading the file 😠. Hence, only the grammars are to blame. Apparently the character set {all_printable} does not include the tab character.
Workaround: Replace tab characters in string literals by \t.
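For files with many affected literals, the workaround could be applied mechanically before import. The following is only a minimal sketch, not part of Structorizer: the function name `escape_tabs_in_string_literals` is hypothetical, and the scanner assumes Java SE8 / C style sources (double-quoted string literals, `//` and `/* */` comments, no multi-line text blocks). Tabs outside string literals are left untouched.

```python
def escape_tabs_in_string_literals(source: str) -> str:
    """Replace raw tab characters inside "..." literals by the two characters \\t."""
    out = []
    i, n = 0, len(source)
    state = "code"  # code | string | char | line_comment | block_comment
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
            elif ch == "/" and nxt == "/":
                state = "line_comment"
            elif ch == "/" and nxt == "*":
                # Consume both characters so that "/*/" is not read as open + close.
                state = "block_comment"
                out.append(ch + nxt)
                i += 2
                continue
        elif state in ("string", "char"):
            if ch == "\\" and nxt:
                # Keep existing escape pairs such as \" or \t intact.
                out.append(ch + nxt)
                i += 2
                continue
            if ch == "\t" and state == "string":
                out.append("\\t")  # the actual workaround
                i += 1
                continue
            closing = '"' if state == "string" else "'"
            if ch == closing or ch == "\n":
                # A newline ends an unterminated literal defensively.
                state = "code"
        elif state == "line_comment":
            if ch == "\n":
                state = "code"
        elif state == "block_comment":
            if ch == "*" and nxt == "/":
                out.append(ch + nxt)
                i += 2
                state = "code"
                continue
        out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = 'String s = "a\tb // c";\t// trailing\tcomment\n'
    print(escape_tabs_in_string_literals(sample))
```

The comment states are tracked only so that a quote character inside a comment is not mistaken for the start of a string literal; a `//` or `/*` inside a literal is copied through unchanged.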
Grammars for Java SE8 and ANSI-C99 now allow tab characters in string literals. Ready for version 3.32-19.
Java import fails if the source file contains string literals in which // occurs as a substring. The same happens if the string literal contains /*, no matter whether or not */ follows within the string literal. Obviously, comment detection does not work correctly with the GOLD engine used in Structorizer. Interestingly, the GOLDBuilder successfully parses the very same files using the very same grammar. Hence, the Structorizer hack that associates comments with Productions during parsing may be to blame for the bug.