azavea / python-omgeo

OMGeocoder - A python geocoding abstraction layer
http://python-omgeo.readthedocs.io/en/latest/
MIT License
36 stars 14 forks source link

Improve string type checking and compatibility #61

Closed KlaasH closed 5 years ago

KlaasH commented 5 years ago

One of the tests in Cicero, when running under Python 3, is using OMGeo with a PlaceQuery full of bytestring attributes. Which causes a crash:

File "/usr/local/lib/python3.7/dist-packages/omgeo/geocoder.py", line 116, in geocode
candidates, upstream_response_info = gs.geocode(processed_pq)
File "/usr/local/lib/python3.7/dist-packages/omgeo/services/base.py", line 211, in geocode
processed_pq = p.process(processed_pq)
File "/usr/local/lib/python3.7/dist-packages/omgeo/preprocessors.py", line 244, in process
return CancelIfRegexInAttr(regex, ('address', 'query')).process(pq)
File "/usr/local/lib/python3.7/dist-packages/omgeo/preprocessors.py", line 226, in process
if any([self.regex.match(attr) is not None for attr in attrs]):
File "/usr/local/lib/python3.7/dist-packages/omgeo/preprocessors.py", line 226, in
if any([self.regex.match(attr) is not None for attr in attrs]):
TypeError: cannot use a string pattern on a bytes-like object

In the course of trying to resolve the issue, I also ran afoul of the string type checking conditions in CancelIfRegexInAttr.__init__(), which were requiring that the type be in (str, str). That got changed from (str, unicode) in PR #47, and not allowing strings of type unicode is not ideal. So I changed it back to (str, unicode) using a workaround--importing str as unicode in Python 3--that's Py2/Py3 compatible without requiring a dependency on six or future.

That didn't actually fix my original issue, though. For that, I added an except TypeError to try the failing comparison on a decodeed version of the attribute if it fails on the provided version.

I also added a few tests to cover this issue. They might not be fully comprehensive, but they do fail without the code changes (in both Py2 and Py3, for different reasons) and pass with the changes.