Closed KvanTTT closed 7 years ago
Sorry I know this is an old/closed bug, but I had a question about this feature. Basically why? If a grammar is meant to be case-insensitive, shouldn't it be designed like that? Having this parameter, which either uppers or lower cases all the input during tests, seems like it would hide issues.
I've stumbled across this, because while testing out the MySqlLexer, it would refuse to accept the test input. I eventually discovered this was because the tests had all be run with the input converted to uppercase. IMHO the correct fix in that situation would be to change the MySqlLexer, not to change my input to be uppercase.
So is there a use case for this I'm missing? Thanks
Case insensitive words improve readability of grammar and have a positive influence on generated parser performance. Meanwhile, in the parser, you still can rely on symbols case. See @sharwell description: https://gist.github.com/sharwell/9424666#gistcomment-1186748.
For Go runtime you feel free to use my AntlrCaseInsensitiveInputStream.go.
Ah so simpler parsers and performance seem like good goals. Thanks for sharing your CaseInsensitiveInputStream, I wrote something similar yesterday that wraps any CharStream. CaseChangingStream.
However, requiring the case change makes the parser harder to use. It is not obvious to the user of the parser that they must do this. Documentation could be improved to fix this, but I would like to suggest an alternative. A option be placed in the g4 file, that changes the language target to either require the generated Lexer to require a CaseInsensitiveInputStream, or preferably just take a normal Input, and wrap it in a CaseChangingStream for example. Thoughts? (I'm happy to take this over to https://github.com/antlr/antlr4/ if better discussed there)
Such issue already exists: https://github.com/antlr/antlr4/issues/1002
Gah ok. So looks like you implemented what I was suggesting, but @parrt shot it down due to lack of time :( That's disappointing.
Maybe it's good :smile: My old implementation is not ideal. Now I think the best implementation should include a grammar and mode options. Case insensitive stream is used if respective option activated.
I want to use not only lower case. but UPPER CASE for case insensitive lexers (such as SQL languages). So, I suggest to use the following options for caseInsensitive parameter: