kean / Regex

Open source regex engine
210 stars 17 forks source link

endless recursion on evil regexp #3

Open yiminghe opened 3 years ago

yiminghe commented 3 years ago

try

let regex = try Regex("(a*)*b\\1")
regex.isMatch("ac");

I try to fix this in my js version by recording state, current input index and cache its match result: https://github.com/yiminghe/kison/blob/2001ac972e77251e0e6ee00a8e7fee126765fa4b/examples/regexp-ll/src/match.js#L3, what's your opinion?

PS: I know it's created for learning purposes, this issue is only for discussion.

kean commented 3 years ago

I haven't worked on this project in a while. Yeah, that's why I dislike non-regular parts of regexes such as backreferences. I haven't focused on them too much and only added the basic support using a simple backtracking algorithm. I use a different algorithm for regular regexes that you can find in the codebase. It handles these types of regexes with ease (as long as you don't use backreferences or lazy quantifiers for which it falls back to backtracking)

I'm sure there are ways to add some defensive options to prevent this issue from happening, but it's hard to reason about, and I'm not sure I'll be to be able to look into it now, sorry.