aparrish / pycorpora

A simple Python interface for Darius Kazemi's Corpora Project.
MIT License

Interface not working? #18

Closed: zachwhalen closed this issue 6 years ago

zachwhalen commented 6 years ago

Hi,

I was just trying to get this running and found a couple of issues. It is possible that these are not inherent to the code but rather a consequence of some external idiosyncrasies, but I will share them nonetheless.

First, installing through pip didn't seem to download the data from Corpora, so pycorpora.get_categories() just returned the test data.

However, installing manually with setup.py and the appropriate --corpora-zip-url worked fine.

Similarly, once imported, pycorpora didn't expose attributes (methods?) for the various categories; instead, I had to pull each file individually.

In this case, I wanted some first names. This didn't work (AttributeError: 'module' object has no attribute 'humans'): names = pycorpora.humans.firstnames["firstNames"]

But this did: pycorpora.get_file("humans","firstnames")['firstNames']

On balance, this is still far more convenient than downloading and parsing it all myself, so I provide this in case it helps someone else with similar problems in the future.
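For anyone hitting the same AttributeError, the workaround above can be wrapped in a small helper. This is only a sketch: load_corpus is a hypothetical name, not part of pycorpora. It assumes nothing beyond the two access styles shown in this thread, attribute access such as pycorpora.humans.firstnames and pycorpora.get_file(category, name).

```python
def load_corpus(module, category, name):
    """Return one corpus file, preferring attribute-style access.

    `module` is expected to be the imported pycorpora module.
    load_corpus is a hypothetical helper, not part of pycorpora itself.
    """
    try:
        # Attribute style, e.g. pycorpora.humans.firstnames
        return getattr(getattr(module, category), name)
    except AttributeError:
        # Fallback reported to work in this thread:
        # pycorpora.get_file("humans", "firstnames")
        return module.get_file(category, name)


# Usage, assuming pycorpora is installed with the Corpora data:
#   import pycorpora
#   names = load_corpus(pycorpora, "humans", "firstnames")["firstNames"]
```

Passing the module in as an argument keeps the helper importable even where pycorpora is missing, and makes it easy to try against a stand-in object.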

hugovk commented 6 years ago

Both are working for me with Python 2.7.14 and Python 3.6.3 on macOS High Sierra, using this pip command from the README.

Python 3

61% [hugo:/tmp] 4s % pip install pycorpora --install-option="--corpora-zip-url=https://github.com/dariusk/corpora/archive/master.zip"
/usr/local/lib/python3.6/site-packages/pip/commands/install.py:194: UserWarning: Disabling all use of wheels due to the use of --build-options / --global-options / --install-options.
  cmdoptions.check_install_build_global(options)
Collecting pycorpora
  Using cached pycorpora-0.1.2.tar.gz
Skipping bdist_wheel for pycorpora, due to binaries being disabled for it.
Installing collected packages: pycorpora
  Running setup.py install for pycorpora ... done
Successfully installed pycorpora-0.1.2
⌂63% [hugo:/tmp] 12s % python
Python 3.6.3 (default, Oct  4 2017, 06:09:15)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycorpora
>>> names = pycorpora.humans.firstnames["firstNames"]
>>> names[0]
'Aaliyah'
>>> names2 = pycorpora.get_file("humans","firstnames")['firstNames']
>>> names2[0]
'Aaliyah'

Python 2

[hugo:/tmp] 4s % pip2 install pycorpora --install-option="--corpora-zip-url=https://github.com/dariusk/corpora/archive/master.zip"
/usr/local/lib/python2.7/site-packages/pip/commands/install.py:194: UserWarning: Disabling all use of wheels due to the use of --build-options / --global-options / --install-options.
  cmdoptions.check_install_build_global(options)
Collecting pycorpora
  Using cached pycorpora-0.1.2.tar.gz
Skipping bdist_wheel for pycorpora, due to binaries being disabled for it.
Installing collected packages: pycorpora
  Running setup.py install for pycorpora ... done
Successfully installed pycorpora-0.1.2
[hugo:/tmp] 10s % python2
Python 2.7.14 (default, Sep 25 2017, 09:53:22)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycorpora
>>> names = pycorpora.humans.firstnames["firstNames"]
>>> names[0]
u'Aaliyah'
>>> names2 = pycorpora.get_file("humans","firstnames")['firstNames']
>>> names2[0]
u'Aaliyah'

I wonder what's different for you? I ran pip install -U pip setuptools wheel first for both, so it might be worth trying that.
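To compare environments, it can help to print the versions of the packaging tools before and after upgrading. A minimal sketch follows, assuming Python 3.8 or newer for importlib.metadata (newer than the interpreters used in this thread); tool_versions is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def tool_versions(names=("pip", "setuptools", "wheel")):
    """Map each package name to its installed version, or None if absent.

    tool_versions is a hypothetical helper, not part of pip or pycorpora.
    """
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in tool_versions().items():
        print(name, ver or "not installed")
```

Running it in both the failing and the working environment shows whether outdated packaging tools are the difference.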

zachwhalen commented 6 years ago

Well, that worked. Huh.