Hey @ccastillo232,
the regex used, /[a-zA-z]\w+/, is eager: as long as it doesn't encounter a delimiter (i.e. as long as there are more \w characters to read), it will continue lexing until the end of the text. Internally, Chevrotain just uses the regex engine of the runtime and behaves just like any regex would on the input.
As such, this behavior is exactly within expectation. You will either need to limit your regex to be less eager or use delimiters in your input. Most languages just use whitespace for that ;)
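To make that concrete, here is a small sketch in plain JavaScript (no Chevrotain involved) of how the quoted pattern behaves on its own. The input strings and the helper names `matchAt` and `tokenizeOnWhitespace` are made up for illustration, since the original test case isn't shown here; the sticky `y` flag is only my way of approximating "try the pattern at the current offset".

```javascript
// The pattern exactly as quoted above; the sticky flag (y) forces the match
// to start at `lastIndex`, similar to a lexer trying a pattern at an offset.
const identifier = /[a-zA-z]\w+/y;

// Hypothetical helper: return the text matched at `offset`, or null.
function matchAt(pattern, text, offset) {
  pattern.lastIndex = offset;
  const m = pattern.exec(text);
  return m === null ? null : m[0];
}

// Hypothetical helper: use whitespace as the delimiter, then match each chunk.
function tokenizeOnWhitespace(pattern, text) {
  return text
    .split(/\s+/)
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => matchAt(pattern, chunk, 0));
}

// Without delimiters, \w+ keeps consuming, so "while" is swallowed.
const glued = matchAt(identifier, "textwhiletext", 0);

// With whitespace between the words, each chunk is matched separately.
const separated = tokenizeOnWhitespace(identifier, "text while text");

console.log(glued);     // textwhiletext
console.log(separated); // [ 'text', 'while', 'text' ]
```

So the single Identifier token is just what the regex itself produces on undelimited input; nothing lexer-specific is needed to reproduce it.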
Thank you for addressing this. It does make sense, although it presents me with a problem for my particular use case. I'll have to get creative.
Closing
I am seeing an issue where I cannot get the Lexer to recognize a keyword over an identifier. I am working based on the example here: https://github.com/chevrotain/chevrotain/blob/master/examples/lexer/keywords_vs_identifiers/keywords_vs_identifiers.js
My test case is this:
This fails because it is recognizing only 1 Identifier, when I expect the token vector to be ['text','while','text'].
I am using Chevrotain version 10.5.0 so that I can test it with Jest.
Am I missing something?