axiak / pyre2

Python wrapper for RE2
BSD 3-Clause "New" or "Revised" License

Add ability to choose regex encoding type #38

Open spender-sandbox opened 8 years ago

spender-sandbox commented 8 years ago

Would it be possible to add support for Latin-1 encoding of regexes? Currently the re2 module can't be used as a drop-in replacement for 're', even for fairly simple regex searches of binary data, because it forces UTF-8 encoding on the regexes: https://github.com/axiak/pyre2/blob/master/src/re2.pyx#L950

RE2 itself supports this; it would just be a matter of adding an option to pass _re2.EncodingLatin1 there.
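
To illustrate why the forced UTF-8 encoding matters for binary data, here is a minimal sketch using only the standard library 're' module (it does not call pyre2 at all, and the sample bytes and variable names are made up for the example). It shows that the same pattern text turns into different bytes under Latin-1 and UTF-8, so the UTF-8 form no longer matches the raw bytes:

```python
import re

# Hypothetical binary blob containing the raw bytes 0xFF 0xFE.
data = b"\x00\x01\xff\xfe\x10"

# The pattern as text: two code points, U+00FF and U+00FE.
pattern_text = "\xff\xfe"

# Latin-1 maps each code point below 256 to exactly one byte.
latin1_pattern = pattern_text.encode("latin-1")  # b'\xff\xfe'

# UTF-8 maps each of these code points to two bytes.
utf8_pattern = pattern_text.encode("utf-8")  # b'\xc3\xbf\xc3\xbe'

# The Latin-1 bytes match the raw data; the UTF-8 bytes do not.
latin1_match = re.search(latin1_pattern, data)
utf8_match = re.search(utf8_pattern, data)

print(latin1_match is not None)  # the raw bytes are found
print(utf8_match is None)        # the UTF-8 encoded pattern is not
```

Any byte at or above 0x80 in a pattern hits this mismatch, which is why a Latin-1 option would be needed for searching binary data.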

Thanks! -Brad