Xion / pyqcy

QuickCheck-like testing framework for Python
http://xion.io/pyqcy
Other
41 stars 0 forks source link

Reversing regular expressions #7

Closed Xion closed 12 years ago

Xion commented 12 years ago

We have str_ generator for producing arbitrary strings but this doesn't give enough of a control on the string's form. For more sophisticated (and useful) solution we'd have to implementing regular expression reverse. That is, given particular regex R we want to generate (random) strings that match R in reasonable amount of time, preferably O(n) where n is the size/length/some other measure of the regex.

A quick search doesn't yield any ready-made Python library for that, but this StackOverflow answer seems useful: http://stackoverflow.com/questions/492716/reversing-a-regular-expression-in-python . Other than that, we could also have a look at Django's reverse function.

This is more like a research topic for now but it would nevertheless be very useful to have it included in pyqcy.

Xion commented 12 years ago

Based on the answer linked above, I implemented a simple reverser which walks the regex AST as returned by internal (and not really documented anywhere) re.sre_parse module. So far I managed to support quite significant part of regex syntax, with [^...stuff...] (i.e. inverted character set match) and capture groups backrefs being probably the most notable exceptions. Also, if I'm not terribly mistaken the solution works in O(n) where n is number of nodes in regex AST.

This seems sufficient for 0.4 but of course could be still significantly improved for further versions.