aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com

Unable to install polyglot-16.7.4 on macOS Sierra #100

Open mdmadhu opened 7 years ago

mdmadhu commented 7 years ago

I haven't been able to figure out why I am getting this UnicodeDecodeError. I have tried the generic fixes for the error message, to no avail. Any idea what may be happening? Thanks!

Collecting polyglot
  Using cached polyglot-16.7.4.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/_1/4bq08bdj03s_bv8s4bwjmbc00000gn/T/pip-build-3md5xw2d/polyglot/setup.py", line 15, in <module>
        readme = readme_file.read()
      File "/Users/mdmadhusudan/anaconda/lib/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2330: ordinal not in range(128)

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/_1/4bq08bdj03s_bv8s4bwjmbc00000gn/T/pip-build-3md5xw2d/polyglot/

saxelsen commented 7 years ago

I am getting the exact same error.

csoni111 commented 7 years ago

This is because setup.py in polyglot 16.7.4 contains with open('README.rst') as readme_file: with no encoding argument, so the file is read with the platform's default encoding, which is ASCII in this environment (as the traceback shows). Since README.rst contains non-ASCII characters, decoding fails with that error.
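To illustrate the failure mode, here is a minimal sketch in Python. It assumes a throwaway temporary file standing in for README.rst, and the helper name read_text is hypothetical. It forces the ASCII codec explicitly to reproduce the error, then reads the same file as UTF-8. Passing an explicit encoding is one common way to avoid this class of error; it is not necessarily what the upstream fix does.

```python
import io
import os
import tempfile


def read_text(path, encoding):
    """Read a text file using an explicitly chosen encoding (hypothetical helper)."""
    with io.open(path, encoding=encoding) as f:
        return f.read()


# Stand-in for README.rst: UTF-8 text containing one non-ASCII character.
sample = u"polyglot caf\u00e9"
fd, path = tempfile.mkstemp(suffix=".rst")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))  # the accented letter is stored as bytes 0xc3 0xa9

# Decoding as ASCII fails on the first byte >= 0x80, as in the traceback above.
try:
    read_text(path, "ascii")
    ascii_failed = False
    error_message = ""
except UnicodeDecodeError as exc:
    ascii_failed = True
    error_message = str(exc)

# Decoding the same bytes as UTF-8 succeeds regardless of the locale.
utf8_text = read_text(path, "utf-8")
os.remove(path)

print(ascii_failed, error_message)
```

Whether a plain open() call picks ASCII depends on the locale settings of the shell running pip, which would explain why the same package installs cleanly on some machines and fails on others.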

You can work around this by installing directly from GitHub (master) instead of PyPI, since the problem has been resolved on the current master in commit 7ba2610. To install directly from GitHub, use: pip install -U git+https://github.com/aboSamoor/polyglot.git@master