aparrish / gutenberg-dammit

I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this

UnicodeDecodeError/NameError when installing from setup.py (Windows-related?) #8

Closed by ryanavella 5 years ago

ryanavella commented 5 years ago

First of all, I wanted to say that this package is awesome! I have been wanting to interact with the Gutenberg corpus for a while now, but I always ended up running into obstacles and giving up prematurely. I'm glad someone else beat me to the punch!

So I've had some issues getting this package up and running on Windows. I'm running a 64-bit install of Python 3.6.5, for reference.

I initially cloned the repository and then attempted to install from setup.py. This is the output I saw:

>>> python setup.py install
Traceback (most recent call last):
  File "setup.py", line 4, in <module>
    readme = readme_file.read()
  File "C:\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10949: character maps to <undefined>

I think this issue is Windows-related. The open() call on line 3 seems to default to the system locale's encoding (Windows-1252 here, judging by the cp1252.py frame in the traceback) rather than UTF-8. I was able to fix it by specifying the encoding on line 3:

from setuptools import setup

with open('README.md', encoding='UTF-8') as readme_file:
    readme = readme_file.read()
...
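For anyone who wants to reproduce the decoding failure without the repository, here is a minimal sketch. It assumes the offending byte comes from a curly closing quote (U+201D, which is E2 80 9D in UTF-8); the actual character in README.md is not confirmed here. The helper try_read and the sample text are hypothetical, and cp1252 is passed explicitly so the behaviour is the same on any platform.

```python
import os
import tempfile


def try_read(path, encoding):
    """Return the decoded text, or the UnicodeDecodeError raised while decoding."""
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError as exc:
        return exc


# Assumption: the README contains a curly quote. U+201D encodes to the bytes
# E2 80 9D in UTF-8, and byte 0x9D has no mapping in Python's cp1252 codec.
sample = "\u201cquoted\u201d"

# Write the sample as UTF-8 to a temporary file (a stand-in for README.md).
fd, sample_path = tempfile.mkstemp(suffix=".md")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write(sample)

as_cp1252 = try_read(sample_path, "cp1252")  # what open() did by default here
as_utf8 = try_read(sample_path, "utf-8")     # what the patched line 3 does
os.remove(sample_path)

print(type(as_cp1252).__name__, "-", as_cp1252)
print("utf-8 round trip ok:", as_utf8 == sample)
```

Passing the encoding explicitly makes setup.py independent of whatever default the machine's locale happens to provide.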

After changing line 3, I tried installing again and received the following output:

>>> python setup.py install
Traceback (most recent call last):
  File "setup.py", line 14, in <module>
    packages=setuptools.find_packages(),
NameError: name 'setuptools' is not defined

This issue doesn't look platform dependent like the UnicodeDecodeError. setup.py calls setuptools.find_packages() on line 14, but only setup is imported from setuptools, so the name setuptools itself is never defined. I was able to fix it by adding an explicit import at the top of the file:

import setuptools
from setuptools import setup
...
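The reason the extra import is needed can be shown with a small sketch. It uses the standard-library json module as a stand-in for setuptools so that it runs anywhere; the helper name_bound_after is hypothetical and only exists for this illustration.

```python
def name_bound_after(import_stmt, name):
    """Run an import statement in a fresh namespace and report whether
    `name` ends up defined there."""
    namespace = {}
    exec(import_stmt, namespace)
    return name in namespace


# "from X import Y" binds only Y; the module name X stays undefined,
# so a later X.something raises NameError.
print(name_bound_after("from json import dumps", "dumps"))  # Y is bound
print(name_bound_after("from json import dumps", "json"))   # X is not
# "import X" binds the module name itself.
print(name_bound_after("import json", "json"))
```

An alternative to adding import setuptools would be to import the function directly (from setuptools import setup, find_packages) and call find_packages() without the module prefix; either approach should resolve the NameError.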

I'm willing to submit a pull request with the above changes if they all seem fine to you.

hugovk commented 5 years ago

See #9, which adds Linux CI and fixes the second part.

See #10, which adds Windows CI. It doesn't fix the first part, but it does reproduce it.

hugovk commented 5 years ago

See #11, which fixes the first part.