ishiland / python-geosupport

Python bindings for NYC Geosupport Desktop
MIT License

Unicode error when running on linux OS #2

Closed (aepyornis closed 6 years ago)

aepyornis commented 6 years ago

First off, thanks for creating this library.

There is an issue with file encoding that occurs when running on Linux (or at least on Debian).

The package installs fine, both from git and from PyPI, but running import geosupport raises this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/geosupport/venv/lib/python3.6/site-packages/geosupport/__init__.py", line 1, in <module>
    from .geosupport import Geosupport
  File "/tmp/geosupport/venv/lib/python3.6/site-packages/geosupport/geosupport.py", line 5, in <module>
    from .function_info import FUNCTIONS, function_help, list_functions, input_help
  File "/tmp/geosupport/venv/lib/python3.6/site-packages/geosupport/function_info.py", line 205, in <module>
    FUNCTIONS = load_function_info()
  File "/tmp/geosupport/venv/lib/python3.6/site-packages/geosupport/function_info.py", line 35, in load_function_info
    for row in csv:
  File "/usr/lib/python3.6/csv.py", line 111, in __next__
    self.fieldnames
  File "/usr/lib/python3.6/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 496: invalid start byte

The CSV files located in function_info/ are encoded as ISO-8859-1, but Python on Linux appears to assume by default that they are UTF-8 and raises an error.
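
For context, here is a minimal sketch of why the decode fails. Byte 0xa0 is a non-breaking space in ISO-8859-1, but in UTF-8 it is a continuation byte and cannot begin a character. When open() is called without an encoding argument, Python falls back to the locale's preferred encoding, which is typically UTF-8 on Linux. The sample bytes below are made up for illustration and are not taken from the actual CSV files.

```python
import locale

# Encoding that open() falls back to when no encoding argument is given.
# Typically UTF-8 on Linux; often a legacy code page such as cp1252 on Windows.
print(locale.getpreferredencoding(False))

# Hypothetical sample: a field containing non-breaking spaces (0xa0),
# as they would appear in an ISO-8859-1 encoded file.
raw = b"name,description\nBBL,Borough\xa0Block\xa0Lot\n"

# Decoding as ISO-8859-1 succeeds; 0xa0 maps to U+00A0.
latin1_text = raw.decode("iso-8859-1")

# Decoding as UTF-8 fails, because 0xa0 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError as exc:
    utf8_failed = True
    print(exc)
```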

I fixed the issue by converting the files to UTF-8 with this shell command:

 find geosupport/function_info/ -type f -name '*.csv' | xargs -I FILE sh -c "iconv -t UTF-8 -f ISO-8859-1 FILE  > tmp && mv tmp FILE"

However, perhaps a fix should be incorporated into the library itself?
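
One way to handle this inside the library might be to pass an explicit encoding when opening the CSV files, so the result no longer depends on the platform default. This is only a sketch: read_rows is a hypothetical helper, not the library's actual load_function_info, and it assumes the files stay in ISO-8859-1.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="iso-8859-1"):
    """Read a CSV file into a list of dicts using an explicit encoding.

    Hypothetical helper for illustration; not part of python-geosupport.
    """
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, encoding=encoding, newline="") as f:
        return list(csv.DictReader(f))


# Usage with a throwaway ISO-8859-1 file containing 0xa0 bytes.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.csv")
    with open(sample_path, "wb") as f:
        f.write(b"name,description\nBBL,Borough\xa0Block\xa0Lot\n")
    rows = read_rows(sample_path)
```

With the encoding stated explicitly, the same bytes decode the same way on Linux, macOS, and Windows. The alternative is to convert the files to UTF-8 and pass encoding="utf-8" instead.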

ishiland commented 6 years ago

Hey, I was able to recreate the issue here using Docker. I think I will use something like this going forward for cross-platform testing. I didn't get much further than that, but I'll try to take a deeper look later this week. @docmarionum1 may have some insight into this as well. In the meantime, if you make any progress, let me know or submit a pull request. Thanks!

aepyornis commented 6 years ago

Great! I'll take a look too and see if I can figure out a good solution. Being able to test across platforms would be helpful, of course. I wouldn't want to make a change that fixes it for Linux and breaks Mac or Windows.

docmarionum1 commented 6 years ago

Yikes. That's my fault. I copied over a bunch of bad characters from the geosupport docs.

Quick fix: I converted all the files to UTF-8 using the command that @aepyornis gave. @ishiland see #3
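
For anyone without iconv, roughly the same conversion can be sketched in Python. convert_csvs_to_utf8 is a hypothetical name, and the sketch assumes every .csv file under the directory is currently ISO-8859-1. Running it on files that are already UTF-8 would double-encode any non-ASCII characters, so it should be run only once.

```python
from pathlib import Path


def convert_csvs_to_utf8(directory, source_encoding="iso-8859-1"):
    """Re-encode every .csv file under `directory` as UTF-8, in place.

    Hypothetical helper; assumes all files are in `source_encoding`.
    Returns the list of converted paths.
    """
    converted = []
    for path in sorted(Path(directory).rglob("*.csv")):
        # Work on raw bytes so line endings are left untouched.
        text = path.read_bytes().decode(source_encoding)
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

A call such as convert_csvs_to_utf8("geosupport/function_info") would cover the same files as the shell command in the original report.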

I also had to make a change to the way that the geosupport library is being loaded for it to work for me. I tested it on RHEL 7.5.

I've used Travis before and could set up CI for Linux, but it doesn't support Windows. AppVeyor looks like an option, but I'm not familiar with it.

https://www.appveyor.com/

ishiland commented 6 years ago

@docmarionum1 that sounds good, I'll investigate AppVeyor. If all goes well, the latest PR should be on PyPI within the next few days.

aepyornis commented 6 years ago

@ishiland @docmarionum1 Thanks!