refi64 / rejit

A work-in-progress JIT-powered regex engine
Mozilla Public License 2.0
110 stars 4 forks source link

Infinite loop with three repeats following each other #11

Closed refi64 closed 8 years ago

refi64 commented 8 years ago

a+a*a+ will loop indefinitely. The middle repeat can be of any kind.

In addition, (a+)(a+) won't set the groups correctly. The first group will be empty and the second will always have a single a regardless of how many are actually in the string.

refi64 commented 8 years ago

With Python, given aaaa, the first regex will match aaa, `, anda`, and the second will do pretty much the same.

vendethiel commented 8 years ago

Is it really an infinite loop, or is it just backtracking with quadratic complexity?

refi64 commented 8 years ago

@vendethiel So far it seems to be an infinite loop. It also fails if I give it an empty string, so it can't really be any kind if long backtracking. I have a fix in mind for #10, and I'm hoping it'll fix this one, too...

refi64 commented 8 years ago

As of https://github.com/kirbyfan64/rejit/commit/7256b3fb08866cadd05083cb622c6a980874f168, this doesn't infinite loop anymore with the empty string, but it still infinite loops with anything else.

refi64 commented 8 years ago

This will eventually cause a stack overflow. Maybe related to #10...

refi64 commented 8 years ago

Well, this was because I was basically never actually popping any threads off, so they just kind of stayed there. Forever.

The thing with the groups is a completely separate issue.