Closed florianschanda closed 6 months ago
I'm not able to deep-dive into this, but there's not a whole lot to the PLY lexer other than repeatedly calling `re.match()`. That said, the `re` module is known to have pathological performance for certain kinds of regex. It's possible that you've somehow fallen into that by accident. Fixing it will require some further investigation.
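To illustrate the kind of pathological behaviour meant here, the sketch below times a textbook nested-quantifier pattern against inputs that almost match. The pattern `(a+)+$` is a hypothetical example, not one taken from the lexer in this issue; the point is only that the time roughly doubles with each extra character.

```python
import re
import time

# Hypothetical pattern with nested quantifiers (NOT from the lexer in this issue).
# On a near-miss input the backtracking matcher tries exponentially many ways
# to split the run of "a" characters between the inner and outer "+".
PATHOLOGICAL = re.compile(r"(a+)+$")

# Accepts the same strings from position 0, but has no nested quantifier.
SAFE = re.compile(r"a+$")


def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for a single pattern.match() call."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for n in range(10, 21, 2):
        text = "a" * n + "b"  # the trailing "b" forces the match to fail
        _, slow = time_match(PATHOLOGICAL, text)
        _, fast = time_match(SAFE, text)
        print(f"n={n:2d}  nested={slow:.4f}s  flat={fast:.6f}s")
```

If a lexer shows the same doubling pattern as the input grows by one character, a nested or overlapping quantifier in one of the rules is a reasonable first suspect.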
OK, thanks for the fast response. I can try to figure out whether it's really a pathological regex as such, or whether the way PLY combines the regexes makes it pathological.
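One way to separate those two possibilities is to time each rule on its own and then time the combined pattern. The sketch below assumes, as far as I understand PLY's behaviour, that the rules are joined into one alternation of named groups; `combine` only approximates that and is not PLY's actual code. The rule names and patterns are placeholders to be replaced with the real ones.

```python
import re
import time

# Hypothetical token rules for illustration only; substitute the real ones.
RULES = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("OPERATOR", r">[=>]?|<[=<]?"),
]


def combine(rules):
    """Join rules into one alternation of named groups (an approximation of PLY)."""
    return re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in rules))


def time_call(pattern, text, repeat=1000):
    """Average seconds per pattern.match(text) call over `repeat` runs."""
    start = time.perf_counter()
    for _ in range(repeat):
        pattern.match(text)
    return (time.perf_counter() - start) / repeat


def profile(rules, text):
    """Time every rule alone, then the combined pattern, on the same input."""
    timings = {name: time_call(re.compile(pattern), text) for name, pattern in rules}
    timings["<combined>"] = time_call(combine(rules), text)
    return timings


if __name__ == "__main__":
    for name, seconds in profile(RULES, "some_identifier_ttttt").items():
        print(f"{name:12s} {seconds * 1e6:8.2f} us")
```

If a single rule is already slow in isolation, the regex itself is the problem; if only the combined pattern is slow, the interaction between alternatives is worth a closer look.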
Aside from your performance issue, there is a logical problem in your code. If you want to implement a lexical analyzer that handles C++ templates, then a rule for operators like `>[=>]?` will lex the two closing brackets of a nested template as the `>>` operator (bitwise right shift, also used for stream extraction). This was a problem in C++ compilers from 1999 to 2011, because the C++ committee had not settled the matter; C++11 finally allowed `>>` to close nested template argument lists. Before that, C++ programmers had to add an extra space and write `> >` to close nested templates.
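A small sketch of the ambiguity, using plain `re` rather than PLY. The token names and the tiny rule set are hypothetical and far smaller than a real C++ lexer; only the operator rule `>[=>]?` comes from the comment above.

```python
import re

# Minimal hypothetical token set: identifiers, angle-bracket operators, whitespace.
TOKEN = re.compile(r"(?P<ID>[A-Za-z_]\w*)|(?P<OP>>[=>]?|<[=<]?)|(?P<WS>\s+)")


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    # The greedy operator rule yields a single ">>" token here ...
    print(tokenize("vector<vector<int>>"))
    # ... while the pre-C++11 spelling yields two separate ">" tokens.
    print(tokenize("vector<vector<int> >"))
```

The lexer has no way of knowing whether `>>` closes two template argument lists or is a shift, so the decision has to be made with parser context, for example by letting the parser split a `>>` token when it is inside a template argument list (one possible approach, stated here as an assumption rather than something proposed in this thread).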
Did I mention that I hate C++ with a burning passion? :)
What are the actual lexing rules in this case? Does the lexer need to be "template aware"? Because that's matlab levels of weirdness right there.
I wanted to point out the problem in case it helps you. AFAIK, this problem can't be solved at the lexical level.
Hi, I am trying to write a C++ lexer with PLY 3.11 and my lexer basically hangs on the following:
Removing 5 `t` characters from the end of the last identifier allows the program to terminate in ~3.7 seconds, and removing just 4 allows it to terminate in 7.5 seconds. I am not sure what the tool is doing, but CTRL+C shows it's here:
I've never had this issue with PLY so far, so I am really confused. This is my lexer so far. Note I've made sure that none of the interesting regexes match the empty string.