Open Panzerschrek opened 8 months ago
https://www.pcre.org/original/doc/html/pcrematching.html#SEC4 A description of the same algorithm in the PCRE library. It is also limited - no backreferences, no submatches, no conditionals, etc.
But it is still may be useful to implement this algorithm for regular expressions without usage of these features.
See https://swtch.com/~rsc/regexp/regexp1.html
This approach works great for long sequences without necessity to rescan input strings multiple times. But it it doesn't seem to work for regular expressions with backreferences.
It's also possible to build a DFA from initial regex NFA and execute it, but it's unclear, how it should work for some cases with greedy search, like
[a-z]*q
, where the longest possible sequence should be found. Also it may be difficult to handle sequences with specified min/max number of iterations.