sprungknoedl closed this issue 11 years ago
This behaviour has been inherited from Flex: http://flex.sourceforge.net/manual/Why-doesn_0027t-flex-have-non_002dgreedy-operators-like-perl-does_003f.html. In short, they recommend moving the logic from the lexer to the parser, or using start conditions (which nex does not support yet).
However, I'm willing to add non-greedy operators to nex if I can find the time. Stay tuned!
Thanks for your help :+1: . I moved said logic to goyacc, but the resulting code is not the prettiest. For my use case, a lot of code duplication arose because I'm only interested in the tokens inside PHP code.
I look forward to your changes :)
For anyone searching for the above broken link, here is a mirror:
http://www.cas.mcmaster.ca/~kahl/SE3E03/2006/flex/flex_82.html
(Since nex seems sort of unmaintained, I've resorted to looking at closed issues for hints! Let's all drop any learnings that we get to, somewhere like here!)
I could not find any way to get the shortest possible match (non-greedy behaviour).
There should be a possibility to split the following snippet:
to these 2 matches:
and
Currently the regex
/<\?php.*\?>/
matches the whole text. Or did I simply miss something? Thanks
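For anyone landing here later: one way to stop at the first `?>` with greedy operators only is the same trick Flex users apply to C comments, where the body of the pattern is never allowed to consume the closing delimiter. Below is a minimal sketch using Go's standard `regexp` package, not nex itself. Whether nex accepts this exact pattern unchanged is an assumption I have not verified, and `splitPHP`, `greedyPHP`, and `firstClosePHP` are hypothetical names made up for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// greedyPHP is the pattern from this issue: `.*` runs on to the last `?>`.
var greedyPHP = regexp.MustCompile(`<\?php.*\?>`)

// firstClosePHP uses only greedy operators. The body is either a character
// that is not `?`, or a run of `?` followed by a character that is neither
// `?` nor `>`, so the body can never swallow a closing `?>`.
var firstClosePHP = regexp.MustCompile(`<\?php([^?]|\?+[^?>])*\?+>`)

// splitPHP (hypothetical helper) returns every non-overlapping match of re.
func splitPHP(re *regexp.Regexp, input string) []string {
	return re.FindAllString(input, -1)
}

func main() {
	input := `<?php echo 1; ?> some html <?php echo 2; ?>`

	// One match spanning both PHP blocks and the HTML between them.
	fmt.Printf("%q\n", splitPHP(greedyPHP, input))

	// Two matches: "<?php echo 1; ?>" and "<?php echo 2; ?>".
	fmt.Printf("%q\n", splitPHP(firstClosePHP, input))
}
```

Note that `[^?]` also matches newlines, so unlike `.*` in Go this version spans multi-line PHP blocks. It gets unwieldy for longer closing delimiters, which is presumably why Flex recommends start conditions or moving the logic into the parser instead.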