vladimarius / pyap

Python address detector and parser
MIT License

fixed re flags #2

Closed KCzar closed 5 years ago

KCzar commented 8 years ago

There was a bug in _normalize_string. It used to be:

text = re.sub(find, replace, text, re.UNICODE)

But the syntax for re.sub is:

re.sub(pattern, repl, string, count=0, flags=0)

So re.UNICODE was being passed as the count parameter rather than as a flag. Since re.UNICODE is the integer 32, each call made at most 32 substitutions and the flag itself was never applied. This caused problems when testing on long strings containing many addresses separated by newlines: only the first 14 or so addresses were extracted.
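A minimal sketch of the failure mode, assuming a made-up input of 100 newline-separated tokens (the sample text and the variable names are illustrative and not taken from pyap). It passes the flag's integer value as count explicitly, which is what the original positional call amounted to:

```python
import re

# Hypothetical input: 100 tokens joined by single newlines (99 newlines total).
text = "\n".join(f"address {i}" for i in range(100))

# re.UNICODE is an integer flag; its value is 32.
print(int(re.UNICODE))

# Equivalent to re.sub(find, replace, text, re.UNICODE): the fourth
# positional parameter is count, so at most 32 replacements are made.
limited = re.sub(r"\n", " ", text, count=int(re.UNICODE))

# Newlines left untouched after the capped substitution.
print(limited.count("\n"))
```

Any newline past the 32nd survives, which is consistent with later addresses in a long string never being normalized.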

I find it is best practice to pass flags as keyword arguments when using regex functions, since it is easy to forget or mix up the exact signatures, so I changed all flags to kwargs.
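As a rough illustration of the keyword-argument style, here is a hypothetical stand-in for a normalization helper. The function name, the conversion list, and the sample text are assumptions for this sketch and are not pyap's actual code:

```python
import re

def normalize_string(text):
    """Hypothetical normalizer; not the real _normalize_string."""
    conversions = [
        (r"\r?\n", " "),  # turn newlines into spaces
        (r"\s+", " "),    # collapse runs of whitespace
    ]
    for find, replace in conversions:
        # flags is passed by keyword, so count keeps its default of 0
        # (replace every match).
        text = re.sub(find, replace, text, flags=re.UNICODE)
    return text

sample = "\n".join(f"address {i}" for i in range(100))
print("\n" in normalize_string(sample))
```

With flags given by name, no argument can slip into the count slot by accident, so every match in the string is replaced.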