ProjetPP / PPP-QuestionParsing-Grammatical

Question Parsing module for the PPP using a grammatical approach
GNU Affero General Public License v3.0

Lemmatization #24

Closed Ezibenroc closed 9 years ago

Ezibenroc commented 9 years ago

Lemmatization of words (for instance, are->be, ran->run).

progval commented 9 years ago

Could you add the dependency in setup.py (in the install_requires section)? Also, it would be nice to have it download the wordnet, but I can't find a classic way to do that, so I might have to look deeper into setuptools to find out how.

Ezibenroc commented 9 years ago

> Could you add the dependency in setup.py? (in the install_requires section)

NLTK? The other dependencies, such as jsonrpclib-pelix, are not listed there either, so I thought that it was only needed for Travis...

> Also it would be nice to have it download the wordnet, but I can't find a classic way to do that so I might have to look deeper into setuptools to find how to do that

python -m nltk.downloader wordnet in a Linux shell.

progval commented 9 years ago

Well, jsonrpclib-pelix is missing too and I did not think about adding it.

I know about the command, but I would like to make it possible to avoid it, by just running the setup script.

Ezibenroc commented 9 years ago

nltk.download("wordnet") also works in a Python shell. Could we not add it in the setup.py?

progval commented 9 years ago

Not directly. We have to run it only when it is appropriate (e.g. not when the user passes --help or uploads a package to PyPI).

Ezibenroc commented 9 years ago

import sys

# Download the WordNet corpus only when setup.py is run with the install command.
if 'install' in sys.argv:
    import nltk
    nltk.download("wordnet")

Ezibenroc commented 9 years ago

Big Python projects do the same thing: Django, Pyglet. So maybe there is no cleaner solution...
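
For what it is worth, the sys.argv check could be made a little more defensive, so that something like "setup.py install --help" does not trigger the download. This is only a rough sketch: the helper name should_download_wordnet and the sets of commands and flags are my own assumptions, not something taken from setuptools or from those projects.

```python
import sys

# Hypothetical helper, not part of the project or of setuptools.
# Both sets below are assumptions about which arguments matter.
INSTALL_COMMANDS = {"install", "develop"}
HELP_FLAGS = {"--help", "-h", "--help-commands"}


def should_download_wordnet(argv):
    """Return True if argv asks for an installation and not just for help."""
    args = argv[1:]  # skip the script name
    if any(arg in HELP_FLAGS for arg in args):
        return False
    return any(arg in INSTALL_COMMANDS for arg in args)


if should_download_wordnet(sys.argv):
    import nltk  # imported lazily so that other commands do not need NLTK
    nltk.download("wordnet")
```

With this, "setup.py sdist upload" and "setup.py --help" skip the download, while "setup.py install" still fetches the corpus. It remains a check on sys.argv, so it is not really cleaner, just less likely to run at the wrong time.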

Ezibenroc commented 9 years ago

Did you find something better? I asked on StackOverflow, but did not get any answer...