bcicen / wikitables

Import tables from any Wikipedia article as a dataset in Python
MIT License
292 stars 34 forks

Add try catch to avoid unicode decode exception in python 2.7.10 #10

Closed lucasSimonelli closed 7 years ago

lucasSimonelli commented 7 years ago

Hi, I tested your lib under Python 2.7.10 and got a crash with the following code:

    from wikitables import import_tables
    tables = import_tables('List of postal codes')

Wiki link: https://en.wikipedia.org/wiki/List_of_postal_codes

I'm guessing the error comes from non-ASCII characters in the country names (e.g. Åland Islands). The code tries to convert to unicode a string that is already unicode. I just wrapped the failing line in a try/except so that the conversion is skipped in that case.
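For illustration only, the guard described above might look roughly like the sketch below. This is not the merged diff: `to_text` is a hypothetical helper name, and the assumption is that the failing line decodes a value that may already be text. On Python 2, calling `.decode()` on a unicode string containing non-ASCII characters triggers an implicit ASCII encode and raises `UnicodeEncodeError`; on Python 3, `str` has no `.decode()` at all.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode `value` to text, or return it unchanged."""
    try:
        return value.decode(encoding)
    except (UnicodeError, AttributeError):
        # UnicodeError: Python 2 unicode input with non-ASCII characters,
        # or bytes that are not valid in `encoding`.
        # AttributeError: Python 3 str, which has no .decode().
        return value


# Byte input is decoded; text input passes through untouched.
name_bytes = b"\xc3\x85land Islands"
name_text = u"\u00c5land Islands"
print(to_text(name_bytes) == name_text)
print(to_text(name_text) == name_text)
```

An explicit type check before decoding would avoid relying on the exception, but the try/except keeps the change to a single line of the original code path.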

Let me know if this helps.

Thanks!

bcicen commented 7 years ago

Thanks @lucasSimonelli! I think this is a reasonable solution for now; merged.