uhop / node-re2

node.js bindings for RE2: fast, safe alternative to backtracking regular expression engines.

Some Occurrence operators seem to error #206

Closed: BannerBomb closed this issue 4 months ago

BannerBomb commented 5 months ago

So if I use a regex like this

new RE2(/((g{2,32}|q){1,32})/iu)

It gives me the error SyntaxError: invalid repetition size: {1,32}

The built-in JavaScript RegExp accepts the same pattern without complaint.

I'm not sure if this is a bug or if it was intended. Just found it weird. And yes, I know a regex like this isn't really good practice as the {2,32} would be irrelevant.

uhop commented 4 months ago

It looks like RE2's code has a hardcoded limit for a maximum repeat count:

Currently, it is set to 1000. It is used for some internal needs. See comments:

While there is a procedure to change the limit, it is unclear how it affects RE2. The available comments suggest that it should be decreased, not increased.

In your example, we have 32 * 32 = 1024, which is greater than 1000.
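The arithmetic can be checked with a small sketch. `effectiveRepeat` and `exceedsLimit` are hypothetical helpers, not part of node-re2 or RE2, and the limit of 1000 is simply the value quoted above. The sketch only multiplies the maxima of nested bounded repetitions; it is not RE2's actual parser logic.

```javascript
// Hypothetical helpers for illustration; not part of node-re2 or RE2.
const MAX_REPEAT = 1000; // limit quoted in the comment above

// Multiply the maxima of nested bounded repetitions, outermost first.
function effectiveRepeat(nestedMaxima) {
  return nestedMaxima.reduce((product, max) => product * max, 1);
}

// True when the combined repeat count is greater than the limit.
function exceedsLimit(nestedMaxima, limit = MAX_REPEAT) {
  return effectiveRepeat(nestedMaxima) > limit;
}

console.log(effectiveRepeat([32, 32])); // 1024
console.log(exceedsLimit([32, 32]));    // true
console.log(exceedsLimit([31, 32]));    // false (31 * 32 = 992)
```

By the same arithmetic, lowering either bound in the original pattern so that the product stays at or below 1000 should keep it under the limit.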

I don't see how we can handle it on our side. If this restriction is important, don't hesitate to contact google/re2.

uhop commented 4 months ago

I should add that this restriction appears to be used only by the parser. It checks bounded repetition operators and ignores unbounded ones: ?, +, *, {5,}, and the like. Where possible, it checks both the minimum and maximum repetition values.

The parser calculates the values recursively, so nested repetitions (like the ones in your example) are multiplied together.
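The recursive calculation described above can be sketched on a toy pattern tree. Everything here is an assumption made for illustration: the node shapes, the `maxNestedRepeat` function, and the choice to treat unbounded operators as contributing no factor all follow the description in this thread, not RE2's source, which may differ in details.

```javascript
// Toy pattern tree; node shapes are hypothetical, not RE2's internal types:
//   { type: "literal" }
//   { type: "concat" | "alternate", children: [...] }
//   { type: "repeat", min, max, child }  // max is Infinity for +, *, {n,}

// Return the largest product of bounded repeat maxima along any nesting path.
function maxNestedRepeat(node) {
  switch (node.type) {
    case "literal":
      return 1;
    case "concat":
    case "alternate":
      // Siblings are not nested in each other, so take the worst branch.
      return Math.max(1, ...node.children.map(maxNestedRepeat));
    case "repeat": {
      // Unbounded repetitions are skipped, as described in the thread.
      const factor = Number.isFinite(node.max) ? Math.max(1, node.max) : 1;
      return factor * maxNestedRepeat(node.child);
    }
    default:
      throw new TypeError(`unknown node type: ${node.type}`);
  }
}

// Tree for (g{2,32}|q){1,32}
const pattern = {
  type: "repeat", min: 1, max: 32,
  child: {
    type: "alternate",
    children: [
      { type: "repeat", min: 2, max: 32, child: { type: "literal" } },
      { type: "literal" },
    ],
  },
};

console.log(maxNestedRepeat(pattern));        // 1024
console.log(maxNestedRepeat(pattern) > 1000); // true
```

Wrapping the same tree in an unbounded repetition such as `*` leaves the result at 1024 in this sketch, because unbounded operators are skipped.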