axiak / pyre2

Python wrapper for RE2
BSD 3-Clause "New" or "Revised" License

Increasing max_mem for Large Regexps #26

Closed (nartzmag closed 9 years ago)

nartzmag commented 10 years ago

I am trying to build a large regex using pyre2 (100k regexes, ~2.1 MB in text form). Which places in the code do I have to change to handle a regex this large? Is there an easier way than going through the code and identifying every place in both re2 and pyre2 that needs changing? So far I've found these spots:

In the pyre2 package, I identified these spots where max_mem is set or used as a default parameter, and changed them:
https://github.com/axiak/pyre2/blob/master/src/re2.cpp#L8764
https://github.com/axiak/pyre2/blob/master/src/re2.cpp#L8768
https://github.com/axiak/pyre2/blob/master/src/re2.cpp#L10745
https://github.com/axiak/pyre2/blob/master/src/re2.pyx#L762
https://github.com/axiak/pyre2/blob/master/src/re2.pyx#L904

And in the re2 package:
https://code.google.com/p/re2/source/browse/re2/re2.h#559
https://code.google.com/p/re2/source/browse/re2/compile.cc#245

A large regex now compiles, but it still throws the DFA out-of-memory error, which leads me to believe I need to selectively give the DFA more memory. Any ideas how to do this?

axiak commented 9 years ago

This is the third argument (max_mem) to the compile() function.
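
For anyone landing here later, a minimal sketch of what that reply suggests. It assumes pyre2 is importable as `re2` and that the third positional argument to `compile()`, after the pattern and the flags, is the memory budget in bytes. The 256 MiB value, the helper names `build_alternation` and `compile_large`, and the fallback to the standard `re` module are illustrative choices, not part of pyre2.

```python
import re

try:
    import re2  # pyre2; assumed to be installed under this module name
except ImportError:
    re2 = None

# Arbitrary illustrative budget; tune it to the size of your pattern set.
MAX_MEM = 256 * 1024 * 1024  # 256 MiB


def build_alternation(patterns):
    """Join sub-patterns into one alternation, each in a non-capturing group."""
    return "|".join("(?:%s)" % p for p in patterns)


def compile_large(patterns, max_mem=MAX_MEM):
    """Compile many sub-patterns as one regex with a raised memory budget."""
    pattern = build_alternation(patterns)
    if re2 is not None:
        try:
            # Third positional argument: memory budget (assumption, per the reply above).
            return re2.compile(pattern, 0, max_mem)
        except TypeError:
            # The installed re2 module has a different compile() signature.
            pass
    # Hypothetical fallback so the sketch still runs without pyre2.
    return re.compile(pattern)


rx = compile_large([r"foo\d+", r"bar[a-z]+"])
print(rx.search("xx foo123") is not None)
```

Passing the budget per call avoids patching the defaults in re2.cpp, re2.pyx, and the re2 sources listed above.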